Abstract



Conditional quantiles provide a natural tool for reporting results from regression analyses
based on semiparametric transformation models. We consider their estimation and the
construction of confidence sets in the presence of censoring.
1 Introduction



One-sided transformation models provide a popular tool for regression analysis of failure time
data. These models assume that the conditional distribution of a failure time $T$ given a vector
of covariates $Z$ has distribution function

$$F(t|z) = F(\Gamma(t), \theta | z) \quad \mu \text{ a.s. } z, \qquad (1)$$

where $\mu$ is the marginal distribution of covariates, $\Gamma$ is an unknown increasing function mapping
the support of the marginal distribution of $T$ onto the positive half-line, and
$\mathcal{F} = \{F(x, \theta|z) : \theta \in \Theta, x > 0\}$ is a parametric family of conditional cdf's supported on $R_+$.
The most common choice corresponds to the scale regression model

$$F(t|z) = G(\Gamma(t) e^{\theta^T z}) \quad \mu \text{ a.s. } z, \qquad (2)$$

where $G$ is a known distribution function. In particular, the proportional hazard model is of
this form. In this case $G$ represents the exponential distribution and the unknown transformation
$\Gamma$ is the so-called baseline cumulative hazard function. Proportionality of hazards means that
the conditional distribution of $T$ given $Z = z$ has hazard rates $h(t|z)$ satisfying

$$e^{\theta^T (z_2 - z_1)} = \frac{h(t|z_2)}{h(t|z_1)}$$

for any two distinct covariate levels $z_1$ and $z_2$. This interpretation of the parameters $(\Gamma, \theta)$ is lost
in other transformation models of type (2) because the shape of the function $\Gamma$ depends on the
distribution $G$.



It is convenient to consider quantiles

$$Q(p|z) = \inf\{t : F(t|z) \ge p\}$$

of the conditional distribution of $T$ given $Z = z$ as an alternative parameter. In transformation
models (2), we have

$$Q(p|z) = \Gamma^{-1}(e^{-\theta^T z} G^{-1}(p)) \qquad (3)$$

for all $p \in (0, 1)$ and $\mu$ almost all $z$. Thus the conditional quantiles are monotone in each
coordinate of the vector $z = (z_1, \dots, z_d)$. In addition, the direction of monotonicity does not
depend on $p$:

$$\mathrm{sign}\left[\frac{\partial}{\partial z_k} Q(p|z)\right] = \mathrm{sign}(-\theta_k) \quad \text{for } k = 1, \dots, d .$$

Invariance of the model with respect to the group of increasing transformations implies also
that for any $p_1 \ne p_2$ we have

$$\frac{\Gamma(Q(p_1|z))}{\Gamma(Q(p_2|z))} = \frac{G^{-1}(p_1)}{G^{-1}(p_2)} \qquad (4)$$

and for any $z_1 \ne z_2$

$$\frac{\Gamma(Q(p|z_1))}{\Gamma(Q(p|z_2))} = e^{-\theta^T (z_1 - z_2)} \qquad (5)$$

for all $p \in (0, 1)$. These three identities can perhaps be better understood by noting that (2)
represents a linear regression model

$$\log \Gamma(T) = -\theta^T Z + \varepsilon,$$

where $Z$ and $\varepsilon$ are independent and $\exp \varepsilon$ has distribution function $G$. In linear regression
models assuming that the transformation $\Gamma$ is known and equal to $\Gamma(t) = e^t$, the conditional
quantiles are linear in $z$ but the slope of the regression does not change with $p$. Likewise, the
identities (4) and (5) have their additive analogue. However, if the transformation is unknown,
then the model is much more difficult to interpret in terms of the parameters $(\theta, \Gamma)$.
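As a numerical illustration of the identities (3)-(5), the following minimal sketch evaluates conditional quantiles in the proportional hazard case, where $G$ is the unit exponential cdf. The transformation $\Gamma(t) = t^2$, the coefficient vector and the covariate levels are hypothetical choices made only for this example; in practice $\Gamma$ is unknown and must be estimated.

```python
import math

def gamma_tf(t):
    # Hypothetical transformation Gamma(t) = t**2 (an assumption for illustration only).
    return t ** 2

def gamma_inv(x):
    # Inverse of the hypothetical transformation on the positive half-line.
    return math.sqrt(x)

def g_inv(p):
    # Inverse of the unit exponential cdf G(x) = 1 - exp(-x) (proportional hazards).
    return -math.log(1.0 - p)

def cond_quantile(p, z, theta):
    # Identity (3): Q(p|z) = Gamma^{-1}(exp(-theta'z) * G^{-1}(p)).
    lin = sum(th * zk for th, zk in zip(theta, z))
    return gamma_inv(math.exp(-lin) * g_inv(p))

theta = [0.5, -1.0]                 # hypothetical regression coefficients
z1, z2 = [1.0, 0.0], [0.0, 2.0]     # two hypothetical covariate levels

# Identity (4): the ratio of transformed quantiles at two levels p does not depend on z.
ratio_p_z1 = gamma_tf(cond_quantile(0.25, z1, theta)) / gamma_tf(cond_quantile(0.75, z1, theta))
ratio_p_z2 = gamma_tf(cond_quantile(0.25, z2, theta)) / gamma_tf(cond_quantile(0.75, z2, theta))

# Identity (5): the ratio at two covariate levels does not depend on p.
ratio_z = gamma_tf(cond_quantile(0.5, z1, theta)) / gamma_tf(cond_quantile(0.5, z2, theta))
```

Here the first coefficient is positive, so the quantile decreases in the first covariate, in agreement with the sign identity stated above.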



Properties of quantile regression in the proportional hazard model are further discussed
in Koenker and Geling (2001) and Portnoy (2003). In particular, Koenker and Geling (2001)
proposed to measure the local effect of the regression coefficient on the conditional quantile $p$
in terms of a parameter $b(p, EZ) = [b_k(p, EZ), k = 1, \dots, d]$, where

$$b_k(p, z) = \frac{\partial}{\partial z_k} Q(p|z) .$$

This parameter can be applied to any regression model. In (2) we have

$$b(p, EZ) = -\theta\, e^{-\theta^T EZ}\, \frac{G^{-1}(p)}{\gamma(Q(p|EZ))}$$



provided the unknown transformation has density $\gamma$ with respect to Lebesgue measure in a
neighbourhood of $Q(p|EZ)$. While $b(p, EZ)$ is proportional to the regression coefficient $\theta$, the
local effect of the regression coefficient is determined by the shape of the density $\gamma$. Portnoy
(2003) considered direct modeling of the conditional quantiles under the assumption that $\Gamma$ is
the identity map. His model takes the form

$$Q(p|z) = e^{\theta(p)^T z} ,$$

so that for fixed $p$ the log-conditional quantiles are linear in $z$, but also the quantile regression
coefficient changes with $p$. However, the choice of the identity map may be problematic. For
other choices of the transformation, we have $Q(p|z) = \Gamma^{-1}(\exp \theta(p)^T z)$. Koenker and Geling's
measure is given by 

It shows that the model is more flexible than the semiparametric transformation model (2),
but it is not clear how to estimate the transformation function in this setting.

In many practical situations researchers may also be interested in the conditional distribution
of $T$ given $\varphi(Z)$, where $\varphi$ is a known function. In particular, if $Z = (V, W)$ represents a
high-dimensional covariate, then the choice $\varphi(Z) = V$ may correspond to a low-dimensional
vector of "main" covariates. If $V$ and $W$ are dependent variables, then the conditional
distribution of $T$ given $V$ follows the more flexible transformation model (1). For example, if
(2) represents the proportional hazard model with parameters $\theta = (\theta_1, \theta_2)$ and the conditional
distribution of $\exp[\theta_2^T W]$ given $V = v$ is gamma with shape $\exp[-\xi(v)]$ and scale $\exp[\xi(v)]$ for a
possibly nonlinear function $\xi$ of $v$, then the marginal conditional distribution of $T$ given $V$ has
distribution function of the form (1) with

$$F(x, \theta_1, \xi | v) = 1 - (1 + \exp[\theta_1 v + \xi(v)]\, x)^{-\exp[-\xi(v)]} .$$



The ratio of conditional hazards is

$$\frac{h(x|v_2)}{h(x|v_1)} = e^{-\theta_1 (v_1 - v_2)}\, \frac{1 + e^{\theta_1 v_1 + \xi(v_1)} \Gamma(x)}{1 + e^{\theta_1 v_2 + \xi(v_2)} \Gamma(x)} .$$

For $x = 0$ the right-hand side is equal to $\exp[-\theta_1 (v_1 - v_2)]$ and changes to $\exp[\xi(v_1) - \xi(v_2)]$
as $x \uparrow \infty$. It represents an increasing function if $\theta_1 (v_1 - v_2) > \xi(v_2) - \xi(v_1)$ and a decreasing
function if the inequality is reversed. The conditional quantile function is equal to

$$Q(p|v) = \Gamma^{-1}(F^{-1}(p, \theta_1, \xi | v)) ,$$

where

$$F^{-1}(p, \theta_1, \xi | v) = \exp[-\theta_1 v - \xi(v)]\, [(1 - p)^{-\exp[\xi(v)]} - 1] .$$

If $\xi(v)$ is constant for almost all $v$, then we obtain the model (2). Otherwise the shape of the
quantile function changes with $p$. The ratios of the transformed quantiles (4) and (5) are no
longer constant in $v$ and $p$, respectively.
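The gamma mixture example can be checked numerically. The sketch below works on the transformed time scale (that is, with $x$ standing for $\Gamma(t)$) and uses a hypothetical function $\xi(v) = 0.3 v^2$ and a hypothetical coefficient $\theta_1 = 0.7$; neither value comes from the paper. It evaluates the marginal cdf, its inverse, and the hazard rate $e^{\theta_1 v} / (1 + e^{\theta_1 v + \xi(v)} x)$ obtained by differentiating the cumulative hazard $-\log(1 - F)$.

```python
import math

def xi(v):
    # Hypothetical nonlinear function of the main covariate (assumption).
    return 0.3 * v * v

def marginal_cdf(x, v, theta1):
    # F(x, theta1, xi | v) = 1 - (1 + exp(theta1*v + xi(v)) * x) ** (-exp(-xi(v)))
    return 1.0 - (1.0 + math.exp(theta1 * v + xi(v)) * x) ** (-math.exp(-xi(v)))

def marginal_quantile(p, v, theta1):
    # Inverse of the marginal cdf in x, on the transformed time scale.
    return math.exp(-theta1 * v - xi(v)) * ((1.0 - p) ** (-math.exp(xi(v))) - 1.0)

def hazard(x, v, theta1):
    # Derivative in x of the cumulative hazard exp(-xi(v)) * log(1 + exp(theta1*v + xi(v)) * x).
    return math.exp(theta1 * v) / (1.0 + math.exp(theta1 * v + xi(v)) * x)

def hazard_ratio(x, v1, v2, theta1):
    return hazard(x, v2, theta1) / hazard(x, v1, theta1)

theta1, v1, v2 = 0.7, 1.0, 2.0   # hypothetical values
ratio_at_zero = hazard_ratio(0.0, v1, v2, theta1)
ratio_far_out = hazard_ratio(1e12, v1, v2, theta1)
```

With these values $\theta_1 (v_1 - v_2) < \xi(v_2) - \xi(v_1)$, so the hazard ratio decreases from $e^{0.7}$ towards $e^{-0.9}$, matching the monotonicity rule stated in the text.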

In the general case, the conditional distribution of $e^{\theta_2^T W}$ given $V$ will not have a simple
analytical form, even if specified via a parametric model. However, quantile regression of the
marginal conditional distributions of the failure time $T$ can also be estimated by combining
nonparametric regression with estimates of the parameters $(\theta, \Gamma)$.

In this paper we consider estimation of the conditional quantiles of $T$ given $\varphi(Z)$, where
$\varphi$ is a function assuming a finite number of values. In particular, if $Z = (Z_1, \dots, Z_d)$ has
one or more discrete components, then results of this paper can be applied to estimation of
quantiles of the marginal conditional distributions of $T$ given any discrete component of $Z$.
On the other hand, in the case of continuous covariates estimation of the marginal conditional
distribution and quantiles requires smoothing and may be difficult to accomplish in moderate
or heavily censored samples. In such circumstances grouping observations into a small number
of categories provides an alternative. For purposes of estimation of the parameters $(\theta, \Gamma)$ in
transformation models (1) and (2), we use procedures proposed by Bogdanovicius and Nikulin
(1999) and Dabrowska (2005). The approach allows for estimation of quantiles of the conditional
distribution of $T$ given $Z = z$ much in the same way as in the proportional hazard model,
i.e. based on the substitution of estimates of $(\theta, \Gamma)$ into (3) (Dabrowska and Doksum, 1987,
Burr and Doss, 1993). Here we derive the asymptotic structure of the estimates of the conditional
quantiles under the assumption that $\varphi$ is a finite valued function, and consider construction
of pointwise and simultaneous confidence sets. We also develop a Gaussian multiplier method
for setting simultaneous confidence sets for the conditional quantile function. It extends the
Gaussian multiplier method for setting confidence bands for the conditional survival function
in the proportional hazard model (Lin, Fleming and Wei, 1994) to transformation models of
type (1). In Section 3 we use data from a Veteran's Administration lung cancer clinical trial
(Kalbfleisch and Prentice, 2000) to illustrate the results. Section 4 contains proofs.



2 Estimation 



We assume that the vector $(X, \delta, Z)$ represents a nonnegative withdrawal time ($X$), a binary
withdrawal indicator ($\delta = 1$ for failure and $\delta = 0$ for loss-to-follow-up) and covariate ($Z$). The
triple $(X, \delta, Z)$ is defined on a complete probability space $(\Omega, \mathcal{F}, P)$ and $(X, \delta)$ are given by
$X = T \wedge \tilde T$, $\delta = 1(X = T)$, where $T$ and $\tilde T$ represent failure and censoring times. The variables
$T$ and $\tilde T$ are conditionally independent given $Z$ and the conditional cumulative hazard function
of $T$ given $Z$ is of the form

$$H(t|z) = A(\Gamma_0(t), \theta_0 | z) \quad \mu \text{ a.s. } z,$$

where $\Gamma_0$ is an unbounded continuous increasing function, $\{A(x, \theta|z) : \theta \in \Theta\}$ is a parametric
family of cumulative hazard functions with hazard rate $\alpha(u, \theta, z)$, and $\theta_0$ is the "true" parameter.
It is assumed throughout the paper that the parameters of the conditional distribution of
the censoring times are non-informative on $(\Gamma, \theta)$.

Let $N(t) = 1(X \le t, \delta = 1)$ and $Y(t) = 1(X \ge t)$ denote the counting and risk processes
associated with the pair $(X, \delta)$. We also set

$$\tau_0 = \sup\{t : EY(t) > 0\}$$

and assume the following regularity conditions. 

Condition 1

(i) The covariate $Z$ has a nondegenerate distribution $\mu$ and is bounded: $\mu(|Z| \le C) = 1$ for
some constant $C$.

(ii) The function $EY(t)$ has at most a finite number of atoms, and $EN(t)$ is continuous.

(iii) The point $\tau > 0$ satisfies $\inf\{t : E[N(t)|Z = z] > 0\} < \tau$ for $\mu$ a.s. $z$. In addition
$\tau < \tau_0$ if $\tau_0$ is a continuity point of the survival function $EY(t)$, and $\tau = \tau_0$ if $\tau_0$ is an
atom of this survival function.

(iv) The parameter set $\Theta \subset R^d$ is open, and the parameter $\theta$ is identifiable in the core model:
$\theta \ne \theta'$ iff $A(\cdot, \theta|z) \not\equiv A(\cdot, \theta'|z)$ $\mu$ a.s. $z$.

(v) There exist constants $0 < m_1 < m_2 < \infty$ such that the hazard rate $\alpha$ satisfies

$$m_1 \le \alpha(x, \theta, z) \le m_2 \qquad (6)$$

for $\mu$ a.s. $z$ and all $\theta \in \Theta$, or (6) and (vi) hold for $\tilde\alpha(x, \theta, z) = \alpha(\Phi(x), \theta, z)\, \Phi'(x)$,
where $\Phi$ is a strictly increasing unbounded twice continuously differentiable function such
that $\Phi(0) = 0$.

(vi) The function $\ell(x, \theta, z) = \log \alpha(x, \theta, z)$ is twice continuously differentiable with respect to
both $x$ and $\theta$. The derivatives with respect to $x$ (denoted by primes) satisfy

$$|\ell'(x, \theta, z)| \le \psi(x), \quad |\ell''(x, \theta, z)| \le \psi(x) ,$$

where $\psi$ is a constant or a continuous bounded decreasing function. The derivatives with
respect to $\theta$ (denoted by dots) satisfy

$$|\dot\ell(x, \theta, z)| \le \psi_1(x), \quad |\ddot\ell(x, \theta, z)| \le \psi_2(x)$$

and

$$|g(x, \theta, z) - g(x', \theta', z)| \le \psi_3(x)\, [|x - x'| + |\theta - \theta'|] ,$$

where $g = \ddot\ell, \dot\ell'$ and $\ell''$. The functions $\psi_p, p = 1, 2, 3$ are continuous, bounded or strictly
increasing and such that $\psi_p(0) < \infty$,

$$\int_0^\infty e^{-x} \psi_1^2(x)\, dx < \infty, \quad \int_0^\infty e^{-x} \psi_2(x)\, dx < \infty, \quad \int_0^\infty e^{-x} \psi_3(x)\, dx < \infty .$$



The assumption that the covariate $Z$ is bounded is restrictive, but standard for analysis of
semiparametric models assuming that the transformation $\Gamma$ is unknown. In the special case of
the proportional hazard model, Andersen and Gill (1982) required only existence of moments
$E|Z|^2 e^{\theta^T Z} 1(X \ge x)$, for $x \ge 0$ and $\theta$ in a neighbourhood $\subset R^d$ of the true parameter $\theta_0$. However,
setting $x = 0$, we see that this moment condition may lead to a constrained optimization
problem which cannot be correctly stated if the distribution of $Z$ is unspecified. For example,
if $Z$ is multivariate normal $N(0, \Sigma)$ and $\Sigma$ is a known non-singular matrix, then the moment
condition is satisfied for all $\theta \in R^d$ and the usual unrestricted partial likelihood approach
towards fitting the regression coefficients applies. However, if $Z$ is a univariate lognormal
variable, $Z \sim \exp N(0, 1)$, then the parameter $\theta$ must be estimated under the added side
condition $\theta \le 0$. Thus the boundedness assumption is restrictive, but allows for parameter
estimation without additional assumptions on the marginal distribution of the covariate.

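Given an iid sample $(N_i, Y_i, Z_i), i = 1, \dots, n$ of the $(N, Y, Z)$ processes, we set
$N_\cdot(t) = n^{-1} \sum_{i=1}^n N_i(t)$,

$$S(x, \theta, t) = \frac{1}{n} \sum_{i=1}^n Y_i(t)\, \alpha_i(x, \theta)$$

and $\alpha_i(x, \theta) = \alpha(x, \theta | Z_i)$. Following Bogdanovicius and Nikulin (1999), define

$$\Gamma_{n\theta}(t) = \int_0^t \frac{N_\cdot(du)}{S(\Gamma_{n\theta}(u-), \theta, u)}$$

for any $\theta \in \Theta$. The process $\{\Gamma_{n\theta} : \theta \in \Theta\}$ is here thought of as the sample analogue of the
solution to the Volterra integral equation

$$\Gamma_\theta(t) = \int_0^t \frac{EN(du)}{s(\Gamma_\theta(u-), \theta, u)} , \qquad (7)$$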

where $s(x, \theta, u) = EY_i(u)\, \alpha_i(x, \theta)$. The condition 1 (iv) was used in Dabrowska (2005) to verify
that this equation has a unique locally bounded solution, and such that $\Gamma_\theta(\tau_0) < \infty$ if $\tau_0$ is
an atom of the survival function $EY(t)$, and $\lim_{t \uparrow \tau_0} \Gamma_\theta(t) \uparrow \infty$ if $\tau_0$ is a continuity point of
$EY(t)$. In particular, the latter applies to uncensored data. Therein we show that in the case
of scale transformation models (2), the condition 1 (v) is satisfied by half-logistic, half-normal
and half-t distributions, proportional odds ratio distribution, frailty models with decreasing
heterogeneity with fixed frailty parameter and polynomial hazards with nonnegative constant
coefficients. These models have smooth differentiable hazards with respect to both $x$ and $\theta$ and
the integrability conditions 1 (vi) imply also that Fisher information is finite. Affine independence
of covariates is sufficient for the condition 1 (iv) to hold. In the case of transformation models
(1), the regularity conditions are satisfied in the gamma frailty model with frailty parameter
representing a function of covariates dependent on a Euclidean parameter. They are also
satisfied in regular polynomial hazard regression models with nonnegative coefficients representing
parametric functions of covariates. In these models, the conditional hazard rates are twice
differentiable with respect to $x$, while the condition 1 (vi) imposes a second order
differentiability assumption on the functions of covariates. Such differentiability conditions are in general
not needed in regular parametric models. However, here we use semiparametric models and
estimation of the parameter $\theta$ will be based on a conditional rank statistics score equation.
We do not know at present how to relax these differentiability conditions to allow for
estimation based on ranks.
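Because $N_\cdot$ jumps only at uncensored observations, the estimator $\Gamma_{n\theta}$ defined earlier in this section can be computed by a forward recursion over the ordered failure times, adding $N_\cdot(\Delta t) / S(\Gamma_{n\theta}(t-), \theta, t)$ at each of them. The following is a minimal sketch for a scalar covariate; the function names and the toy data are assumptions made for illustration, and the two hazard rates shown are the proportional hazard rate $e^{\theta z}$ and the proportional odds rate $e^{\theta z} (1 + e^{\theta z} x)^{-1}$.

```python
import math
from collections import Counter

def ph_hazard(x, theta, z):
    # Core hazard rate of the proportional hazard model (does not depend on x).
    return math.exp(theta * z)

def po_hazard(x, theta, z):
    # Core hazard rate of the proportional odds ratio model.
    e = math.exp(theta * z)
    return e / (1.0 + e * x)

def gamma_n(times, events, z, theta, alpha):
    """Forward recursion for Gamma_{n,theta} at the distinct failure times.

    times: observed X_i; events: delta_i (1 = failure, 0 = censored);
    z: scalar covariates Z_i; alpha(x, theta, z): core hazard rate.
    Returns a list of (failure time, value of Gamma_{n,theta}) pairs.
    """
    n = len(times)
    fail_counts = Counter(t for t, d in zip(times, events) if d == 1)
    gam, path = 0.0, []
    for t in sorted(fail_counts):
        # S(Gamma(t-), theta, t) = n^{-1} * sum_i Y_i(t) * alpha(Gamma(t-), theta, Z_i)
        s = sum(alpha(gam, theta, zi) for x, zi in zip(times, z) if x >= t) / n
        gam += (fail_counts[t] / n) / s   # jump of N. at t divided by S
        path.append((t, gam))
    return path

# Toy data (hypothetical): with theta = 0 and the PH hazard the recursion
# reduces to the Nelson-Aalen estimator of the cumulative hazard.
toy_path = gamma_n([2.0, 1.0, 3.0, 4.0], [1, 1, 0, 1], [0.0, 0.0, 0.0, 0.0], 0.0, ph_hazard)
```

For models other than proportional hazards the denominator depends on the current value of the estimate, which is why the recursion has to proceed in time order.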



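For any $\tau$ satisfying condition 1, the function $\{\Gamma_\theta(t) : t \in [0, \tau], \theta \in \Theta\}$ is Fréchet
differentiable with respect to $\theta$ and the derivative satisfies the linear Volterra equation

$$\dot\Gamma_\theta(t) = -\int_0^t \dot s(\Gamma_\theta(u-), \theta, u)\, C_\theta(du) - \int_0^t \dot\Gamma_\theta(u-)\, s'(\Gamma_\theta(u-), \theta, u)\, C_\theta(du) ,$$

where $\dot s(\Gamma_\theta(u-), \theta, u) = EY_i(u)\, \dot\alpha_i(\Gamma_\theta(u-), \theta)$, $s'(\Gamma_\theta(u-), \theta, u) = EY_i(u)\, \alpha_i'(\Gamma_\theta(u-), \theta)$ and

$$C_\theta(t) = \int_0^t \frac{EN(du)}{s^2(\Gamma_\theta(u-), \theta, u)} .$$

In the case of the proportional hazard model, the function $s'$ is identically equal to 0. Otherwise,
the solution to this Volterra equation is given by

$$\dot\Gamma_\theta(t) = -\int_0^t \dot s(\Gamma_\theta(u-), \theta, u)\, C_\theta(du)\, \mathcal{P}_\theta(u, t) ,$$

$$\mathcal{P}_\theta(u, t) = \pi_{(u,t]}(1 - s'(\Gamma_\theta(w-), \theta, w)\, C_\theta(dw)) .$$

Here for any function $b$ of bounded variation, $\pi_{(u,t]}(1 + b(dw))$ is the product integral, i.e.

$$\pi_{(u,t]}(1 + b(dw)) = \prod_{u < w \le t} (1 + b(\Delta w))\, \exp[b_c(t) - b_c(u)] ,$$

where $b_c$ is the continuous part of $b$ and the product is taken over its atoms. To make the
definition complete, in the case of the proportional hazard model we set $\mathcal{P}_\theta(u, t) = 1$. With
this choice, the form of the function $\dot\Gamma_\theta$ is the same for all models of type (1) considered in this
paper.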
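The product integral used above combines a finite product over the atoms of $b$ with an exponential of the increment of its continuous part. A minimal sketch follows, assuming the atoms are supplied as a dictionary of jump sizes and the continuous part as an ordinary function; both representations and the numerical values are hypothetical conveniences, not part of the paper.

```python
import math

def product_integral(jumps, cont, u, t):
    """Product integral of (1 + b(dw)) over the interval (u, t].

    jumps: dict mapping an atom location w to the jump size b(Delta w);
    cont:  function returning the continuous part b_c evaluated at a time point.
    """
    prod = 1.0
    for w, db in jumps.items():
        if u < w <= t:                 # only atoms inside (u, t] contribute
            prod *= 1.0 + db
    return prod * math.exp(cont(t) - cont(u))

# Hypothetical example: two atoms and a linear continuous part.
jumps = {1.0: -0.5, 2.0: 0.25}
cont = lambda w: 0.1 * w
whole = product_integral(jumps, cont, 0.0, 3.0)
left = product_integral(jumps, cont, 0.0, 1.5)
right = product_integral(jumps, cont, 1.5, 3.0)
```

The product integral is multiplicative over adjacent intervals, so `whole` equals `left * right` up to rounding.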

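Let $\alpha_i(x, \theta) = \alpha(x, \theta, Z_i)$ and $\ell_i(x, \theta) = \log \alpha(x, \theta, Z_i)$. We shall apply the same convention
to derivatives of the functions $\alpha_i$ and $\ell_i$ with respect to $\theta$ and $x$. Define functions

$$v(u, \theta) = \frac{EY_i(u)\, [\ell_i'^2 \alpha_i](\Gamma_\theta(u), \theta)}{s(\Gamma_\theta(u), \theta, u)} - \left(\frac{s'}{s}\right)^2 (\Gamma_\theta(u), \theta, u) ,$$

$$\phi(u, \theta) = \frac{EY_i(u)\, [\dot\ell_i^{\otimes 2} \alpha_i](\Gamma_\theta(u), \theta)}{s(\Gamma_\theta(u), \theta, u)} - \left(\frac{\dot s}{s}\right)^{\otimes 2} (\Gamma_\theta(u), \theta, u) ,$$

$$\rho(u, \theta) = \frac{EY_i(u)\, [\dot\ell_i \ell_i' \alpha_i](\Gamma_\theta(u), \theta)}{s(\Gamma_\theta(u), \theta, u)} - \left(\frac{\dot s}{s}\right) \left(\frac{s'}{s}\right) (\Gamma_\theta(u), \theta, u)$$

and

$$K_\theta(t, t') = \int_0^{t \wedge t'} C_\theta(du)\, \mathcal{P}_\theta(u, t)\, \mathcal{P}_\theta(u, t') ,$$

$$B_\theta(t) = \int_0^t v(u, \theta)\, EN(du) .$$

Suppose that $v(u, \theta) \not\equiv 0$ a.e.-$EN$ and let $\varphi_\theta = \int_0^{\cdot} g_\theta\, d\Gamma_\theta$ be a vector valued function with $d$
components and square integrable with respect to $B_\theta$.

Define matrices

$$\Sigma_1(\theta) = \int_0^\tau v_\varphi(t, \theta)\, EN(dt) ,$$

$$\Sigma_2(\theta) = \int_0^\tau \int_0^\tau K_\theta(t, u)\, \rho_\varphi(t, \theta)\, \rho_\varphi(u, \theta)^T\, EN(du)\, EN(dt) ,$$

$$\Sigma(\theta) = \Sigma_1(\theta) + \Sigma_2(\theta) ,$$

where

$$\rho_\varphi(t, \theta) = \rho(t, \theta) - v(t, \theta)\, \varphi_\theta(t) .$$

In the following we choose $\varphi_\theta$ as the solution to the Fredholm equation

$$\varphi_\theta(t) + \int_0^\tau K_\theta(t, u)\, v(u, \theta)\, \varphi_\theta(u)\, EN(du) = -\dot\Gamma_\theta(t) + \int_0^\tau K_\theta(t, u)\, \rho(u, \theta)\, EN(du) \qquad (8)$$

or equivalently

$$\varphi_\theta(t) + \dot\Gamma_\theta(t) = \int_0^\tau K_\theta(t, u)\, \rho_\varphi(u, \theta)\, EN(du) = \int_0^\tau K_\theta(t, u)\, \rho_{-\dot\Gamma}(u, \theta)\, EN(du) - \int_0^\tau K_\theta(t, u)\, [\varphi_\theta + \dot\Gamma_\theta](u)\, B_\theta(du) .$$

This equation has a unique solution, square integrable with respect to $B_\theta$. We define it as
$\varphi_\theta = -\dot\Gamma_\theta$ if $\rho_{-\dot\Gamma}(u, \theta) \equiv 0$. In this case we have $\Sigma_2(\theta) = 0$. Finally, if $v(t, \theta) \equiv 0$ a.e. $EN$,
then $\rho(t, \theta) \equiv 0$ as well. For the sake of completeness, we set in this case $\varphi_\theta = -\dot\Gamma_\theta$. We also
have $\Sigma_2(\theta) = 0$, and $\Sigma_1(\theta)$ simplifies to $\Sigma_1(\theta) = \int \phi(u, \theta)\, EN(du)$. This last choice corresponds
to the proportional hazard model, and the scale regression models with regression coefficient
$\theta = 0$. (Note that if $v(u, \theta) \equiv 0$, then the $\varphi_\theta$ function does not enter into the score equation
below.)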



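To estimate the parameter $\theta$, we use a solution to the score equation $U_n(\theta) = 0$, where

$$U_n(\theta) = \frac{1}{n} \sum_{i=1}^n \int_0^\tau [b_{1i}(\Gamma_{n\theta}(t), t, \theta) - b_{2i}(\Gamma_{n\theta}(t), t, \theta)\, \varphi_{n\theta}(t)]\, N_i(dt) , \qquad (9)$$

$\varphi_{n\theta}$ is an estimator of $\varphi_\theta$, and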

If $\Gamma_0$ is a known function, e.g. $\Gamma_0(t) = t$, then under the assumption of conditional independence
of failure and censoring times, the MLE score equation for estimation of the parameter $\theta$ is
given by $\tilde U_n(\theta) = 0$, where

$$\tilde U_n(\theta) = \frac{1}{n} \sum_{i=1}^n \int_0^\tau \dot\ell_i(\Gamma_0(t), \theta)\, N_i(dt) - \int_0^\tau \dot S(\Gamma_0(t), \theta, t)\, \Gamma_0(dt)$$

and $\dot S(x, \theta, t) = n^{-1} \sum_{i=1}^n Y_i(t)\, \dot\alpha_i(x, \theta)$. In addition, the assumption of conditional
independence of failure and censoring times implies that the function (7) satisfies $\Gamma_{\theta_0}(t) = \Gamma_0(t)$ at
the true value $\theta_0$ of the parameter $\theta$. This last identity continues to hold when the
transformation $\Gamma_0$ is unknown. Therefore a natural approach to estimation of the parameter $\theta$ is to
consider solving the score equation $\hat U_n(\theta) = 0$, where

$$\hat U_n(\theta) = \frac{1}{n} \sum_{i=1}^n \int_0^\tau b_{1i}(\Gamma_{n\theta}(t), t, \theta)\, N_i(dt) .$$

In particular, this is the usual score equation for estimation of the parameter $\theta$ in the
proportional hazard model. In general transformation models (1), this choice leads to an
asymptotically inefficient estimate of the parameter $\theta$. It may also lead to estimates of poor performance
in moderate sample sizes. This also applies to score processes of the form (9), where $\varphi_{n\theta}$ is an
estimate of some function $\varphi_\theta$ square integrable with respect to $B_\theta$. For example, Bogdanovicius
and Nikulin (1999) considered the choice of $-\dot\Gamma_\theta$, corresponding to the score equation derived
from a modified partial likelihood function. Under mild regularity conditions on the estimator
of the function $\varphi_\theta$, the solution to the score equation (9) exists with probability tending
to 1 and is unique in local neighbourhoods of the true parameter $\theta_0$. However, its asymptotic
variance assumes the usual "sandwich" form because the process $\varphi_{n\theta}$ has a non-trivial
contribution to both the asymptotic variance of the score process and its negative derivative with
respect to $\theta$. The choice of the $\varphi_\theta$ function corresponding to the solution to the Fredholm
equation (8) leads to an M estimator whose asymptotic variance is of non-sandwich form and
equal to the inverse of the asymptotic variance of the score function. The form of the solution
to this equation can be found in Dabrowska (2005). The resulting estimator can also be shown
to be asymptotically efficient under the assumption that the point $\tau_0 = \sup\{t : EY(t) > 0\}$
forms an atom of the survival function $EY(t)$. The following proposition summarizes some
properties of the estimates of $(\theta, \Gamma)$.

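Proposition 1 Suppose that the conditions 1 are satisfied. Let $\Sigma_1(\theta_0)$ be non-singular,
and let $\varphi_{n\theta}$ be an estimator of the function $\varphi_\theta$ such that $\|\varphi_{n\theta_0} - \varphi_{\theta_0}\|_\infty \to_P 0$,
$\limsup_n \|\varphi_{n\theta_0}\|_v = O_P(1)$, $\varphi_{n\theta} - \varphi_{n\theta'} = (\theta - \theta')\, \psi_{n\theta,\theta'}$, where

$$\sup\{\limsup_n \|\psi_{n\theta,\theta'}\|_v : \theta, \theta' \in B(\theta_0, \varepsilon_n)\} = O_P(1)$$

and $B(\theta_0, \varepsilon_n) = \{\theta : \|\theta - \theta_0\| \le \varepsilon_n\}$ for some sequence $\varepsilon_n \downarrow 0$, $\sqrt{n}\, \varepsilon_n \to \infty$. Then, with
probability tending to 1, the score equation $U_n(\theta) = 0$ has a unique solution $\hat\theta$ in $B(\theta_0, \varepsilon_n)$.
Moreover, $[\hat T, \hat W_0]$, $\hat T = \sqrt{n}(\hat\theta - \theta_0)$, $\hat W_0 = \sqrt{n}[\Gamma_{n\hat\theta} - \Gamma_0 - (\hat\theta - \theta_0)\, \dot\Gamma_{\theta_0}]$ converges weakly in
$R^d \times \ell^\infty([0, \tau])$ to a mean zero Gaussian process $[T, W_0]$ with covariance

$$\mathrm{cov}\, T = \Sigma^{-1}(\theta_0), \quad \mathrm{cov}(W_0(t), T) = -\Sigma^{-1}(\theta_0)\, [\varphi_{\theta_0} + \dot\Gamma_{\theta_0}](t),$$
$$\mathrm{cov}(W_0(t), W_0(t')) = K_{\theta_0}(t, t') .$$

An example of an estimator of the function $\varphi_\theta$ is given in Section 3. The asymptotic
covariances can be estimated using the substitution method.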

Let us assume now that $\mathcal{D} = \{D_j : j = 1, \dots, k\}$ is a finite partition of the covariate space
such that

$$\pi(D) = P(Z \in D) > 0, \quad D \in \mathcal{D}. \qquad (10)$$

We denote by $F_D(t) = P(T \le t | Z \in D)$ the cdf of the conditional distribution of $T$ given
$Z \in D$, $D \in \mathcal{D}$. Under the assumption of the transformation model, this function is of the
form

$$F_D(t) = \frac{1}{\pi(D)}\, E\, 1[Z \in D]\, F(\Gamma_0(t), \theta_0 | Z) .$$

In practice, the partition $\mathcal{D}$ will be chosen based on the observations. For example, if $Z =
(Z_1, \dots, Z_d)$ is a multivariate covariate whose first component is continuous, then a natural
partition of the covariate space may correspond to selection of $k = 4$ intervals determined by
the sample quartiles of $Z_1$. If subjects are ranked according to values of the exponential factors
$e^{\theta^T Z}$, then a natural partition may correspond to several groups determined by the distribution
of $e^{\theta^T Z}$. Any selection of such a partition requires some form of estimation of parameters
of the marginal distribution of the covariates. Here we consider a naive situation in which
the cell probabilities can be estimated nonparametrically by means of sample proportions.
This choice arises in analyses of models with possibly high-dimensional discrete or mixed
discrete-continuous covariates, whenever interest is only in analyses of marginal conditional
distributions corresponding to discrete variables representing treatment types, patients' gender
etc. In the data example given in Section 3, a many valued discrete variable representing a
quantitative measurement of the patient's performance status admits a natural partition into three
groups corresponding to a more intuitive qualitative description of health condition at the time
of entry into the clinical trial.

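As an estimate $\hat F_D(t)$ of the function $F_D(t)$ we take

$$\hat F_D(t) = \frac{1}{n\, \hat\pi(D)} \sum_{i=1}^n 1[Z_i \in D]\, F(\Gamma_{n\hat\theta}(t), \hat\theta | Z_i), \qquad \hat\pi(D) = \frac{1}{n} \sum_{i=1}^n 1[Z_i \in D] .$$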

We also define scalar and vector valued functions 



^ ' i=l 

where F{x,6\z) is the derivative of F{x,6\z) with respect to 9. 
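The plug-in idea behind the estimate of $F_D$ can be sketched in a few lines: average the fitted conditional cdf over the sample members whose covariate falls into the cell $D$, which is the same as dividing the sum over the cell by $n$ times the sample proportion of the cell. The sketch below assumes a scalar covariate and takes the fitted values $\Gamma_{n\hat\theta}(t)$ and $\hat\theta$ as given inputs; the proportional hazard core cdf and all numerical values are hypothetical.

```python
import math

def ph_cdf(x, theta, z):
    # Core-model cdf of the proportional hazard model: F(x, theta | z) = 1 - exp(-x * exp(theta*z)).
    return 1.0 - math.exp(-x * math.exp(theta * z))

def group_cdf(gamma_hat_t, theta_hat, z, in_group, core_cdf=ph_cdf):
    """Plug-in estimate of F_D(t) given the fitted transformation value at t.

    gamma_hat_t: fitted value of the transformation at time t (assumed given);
    theta_hat:   fitted regression coefficient (assumed given);
    z:           sample covariates; in_group: indicator function of the cell D.
    """
    members = [zi for zi in z if in_group(zi)]
    if not members:
        raise ValueError("the cell D contains no observations")
    # Averaging over the cell equals (n * pi_hat(D))^{-1} * sum_i 1[Z_i in D] F(...).
    return sum(core_cdf(gamma_hat_t, theta_hat, zi) for zi in members) / len(members)

z_sample = [0.0, 0.0, 1.0, 1.0]          # hypothetical binary covariate
value = group_cdf(1.0, math.log(2.0), z_sample, lambda zi: zi == 1.0)
```

Evaluating `group_cdf` over a grid of time points, with the fitted transformation supplied at each of them, gives the step function whose quantiles are studied next.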

Finally, we denote by $\| \cdot \|$ the supremum norm on $\mathcal{T} = [0, \tau] \times \mathcal{D}$ and let $\ell^\infty(\mathcal{T})$ be the
space of bounded functions on $\mathcal{T}$ endowed with the supremum norm.

Proposition 2 Suppose that the conditions of Proposition 1 are satisfied and (10) holds.

(i) We have $\|\hat F - F\| \to_P 0$ and $\hat W = \{\hat W(t, D) = \sqrt{n}[\hat F(t, D) - F(t, D)] : (t, D) \in \mathcal{T}\}$
converges weakly in $\ell^\infty(\mathcal{T})$ to $W$, a mean zero Gaussian process. Its covariance function
is given in Section 4.

(ii) Let Vi = iVii,V2i),i = 1,2, ...,n and V3 = (V31, . . . , Vs,^) be mutually independent 
A/'(0, 1) variables, independent of the observations (Xj, Si, Zi),i = 1, . . . , n. Define 

w*{t,D) = w*{t)Mt,D)+ rw*{s)p^js,e)NXds)i:-^ 

Jo 

where 
and $\Sigma_{1n}(\hat\theta)$, $\Sigma_n(\hat\theta)$, $\varphi_n = \varphi_{n\hat\theta}$, $\rho_{\varphi_n}(u, \hat\theta)$ and $\mathcal{P}_{n\hat\theta}(u, t)$ are estimates of $\Sigma_1(\theta_0)$, $\Sigma(\theta_0)$,
$\varphi_{\theta_0}$, $\rho_\varphi(u, \theta_0)$, $\mathcal{P}_{\theta_0}(u, t)$ obtained using the substitution method. The process $W^* =
\{W^*(t, D) : (t, D) \in \mathcal{T}\}$ converges weakly (unconditionally) in $\ell^\infty(\mathcal{T})$
to a Gaussian process with the same covariance function as the process $W$ of part
(i) and independent of it. Conditionally, the process $W^*$ converges weakly to $W$ in
probability.

The proof is given in Section 4. In the first part of the proposition, the observations
$R_i = (X_i, \delta_i, Z_i), i = 1, \dots, n, \dots$ are defined as coordinate projections on the product
probability space $(\Omega^\infty, \mathcal{F}^\infty, P^\infty)$. In the second part, we use the product probability space
$(\Omega^\infty \times \mathcal{V} \times \mathcal{V}', \mathcal{F}^\infty \times \mathcal{B} \times \mathcal{B}', P^\infty \times Q \times Q')$. The variables $R_i = (X_i, \delta_i, Z_i), i = 1, \dots, n, \dots$,
$V_i, i = 1, \dots, n, \dots$ and $V_3$ are defined as first, second and last projections. Conditional weak
convergence in probability means

$$\sup_{f \in BL_1} |E_V f(W^*) - E f(W)| \to 0$$

in (outer) probability, where $BL_1$ is the set of all real functions on $\ell^\infty(\mathcal{T})$ with a Lipschitz
norm bounded by 1 (van der Vaart and Wellner, 1996, Ch. 2.9).

We proceed to the discussion of the properties of the quantile regression. For $p \in (0, 1)$ and
(fixed) $D \in \mathcal{D}$ let

$$\ell_D(p) = \inf\{t : F_D(t) \ge p\} , \quad u_D(p) = \sup\{t : F_D(t) \le p\} .$$

Then $\ell_D(p) \le u_D(p)$ and the $p$-th quantiles of the conditional distribution of $T$ given $Z \in D$
are defined as the set of numbers in the closed interval $[\ell_D(p), u_D(p)]$. We denote by $\hat\ell_D(p)$ and
$\hat u_D(p)$ the sample counterparts of these points, i.e.

$$\hat\ell_D(p) = \inf\{t : \hat F_D(t) \ge p\}, \quad \hat u_D(p) = \sup\{t : \hat F_D(t) \le p\} .$$

If $u_D(p) < \tau$, then under the assumptions of Proposition 2, we have

$$\ell_D(p) \le \liminf_n \hat\ell_D(p) \le \limsup_n \hat u_D(p) \le u_D(p) \qquad (11)$$

with probability tending to 1. Indeed, let $\varepsilon = \varepsilon(D) > 0$ be arbitrary but small enough so that
$u_D(p) + \varepsilon < \tau$. Then

$$F_D(\ell_D(p) - \varepsilon) < p, \quad F_D(u_D(p) + \varepsilon) > p$$

and uniform consistency of the estimate $\hat F_D(\cdot)$ implies that with probability tending to 1, we
also have

$$\hat F_D(\ell_D(p) - \varepsilon) < p, \quad \hat F_D(u_D(p) + \varepsilon) > p .$$

This in turn implies (11).
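For a right-continuous step function, the two endpoints $\inf\{t : F(t) \ge p\}$ and $\sup\{t : F(t) \le p\}$ can be read off from the sorted jump locations with a binary search. The following minimal sketch assumes the cdf estimate is stored as two parallel lists (jump times and the cdf values attained at them) and equals 0 before the first jump; this representation and the example values are assumptions made for illustration.

```python
import math
from bisect import bisect_left, bisect_right

def lower_upper_quantile(jump_times, cdf_values, p):
    """Endpoints of the set of p-th quantiles of a right-continuous step cdf.

    jump_times: increasing jump locations; cdf_values[k] is the cdf at jump_times[k].
    Returns (inf{t : F(t) >= p}, sup{t : F(t) <= p}); an endpoint is math.inf
    when the step function never exceeds (or never reaches) the level p.
    """
    k = bisect_left(cdf_values, p)      # first jump where the cdf reaches p
    lower = jump_times[k] if k < len(jump_times) else math.inf
    j = bisect_right(cdf_values, p)     # first jump where the cdf exceeds p
    upper = jump_times[j] if j < len(jump_times) else math.inf
    return lower, upper

times = [1.0, 2.0, 3.0]
values = [0.25, 0.5, 1.0]
flat_case = lower_upper_quantile(times, values, 0.5)    # cdf equals 0.5 on [2, 3)
unique_case = lower_upper_quantile(times, values, 0.4)  # cdf jumps over 0.4 at t = 2
```

The two endpoints differ only when the level $p$ coincides with the height of a flat stretch of the step function.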
In the following we shall assume that the transformation function $\Gamma_0$ has density $\gamma$ with
respect to the Lebesgue measure, and the function $\gamma$ is uniformly continuous and bounded
away from 0 on an interval $[\tau_1 - \varepsilon, \tau_2 + \varepsilon]$, $0 < \tau_1 - \varepsilon$, $\tau_2 + \varepsilon < \tau < \tau_0$ and such that

$$\tau_1 = \min\{\ell_D(p_1) : D \in \mathcal{D}\}, \quad \tau_2 = \max\{u_D(p_2) : D \in \mathcal{D}\} . \qquad (12)$$

Let $I = [p_1, p_2]$ and set $\mathcal{I} = I \times \mathcal{D}$. In this case the conditional distribution of $T$ given $Z \in D$
has a unique $p$-th quantile $Q_D(p)$ for any $p \in I$ and we define its sample analogue by setting

$$\hat Q_D(p) = \hat\ell_D(p) = \inf\{t : \hat F_D(t) \ge p\} .$$

Then (11) implies that $\hat Q_D(p) \to_P Q_D(p)$ pointwise in $(p, D) \in \mathcal{I}$. Using finiteness of the class
$\mathcal{D}$, monotonicity of $\hat F_D(t)$ and $F_D(t)$, and an argument similar to the classical Glivenko-Cantelli
theorem, we also have

$$\sup\{|\hat Q_D(p) - Q_D(p)| : (p, D) \in \mathcal{I}\} \to_P 0 .$$

Proposition 3 Suppose that the conditions of Proposition 2 hold, and $\Gamma_0$ has density $\gamma$
with respect to the Lebesgue measure such that $\gamma$ is uniformly continuous and bounded away
from 0 on an interval $[\tau_1 - \varepsilon, \tau_2 + \varepsilon]$, $0 < \tau_1 - \varepsilon$, $\tau_2 + \varepsilon < \tau$ satisfying (12). The normalized
quantile process $\hat V = \{\hat V(p, D) : (p, D) \in \mathcal{I}\}$ given by

$$\hat V(p, D) = \sqrt{n}\, [\hat Q_D - Q_D](p) ,$$

converges weakly in $\ell^\infty(\mathcal{I})$ to $V = \{V(p, D) = -h(p, D)\, W(Q_D(p), D) : (p, D) \in \mathcal{I}\}$, where

$$h(p, D) = [f_D(Q_D(p))\, \gamma(Q_D(p))]^{-1} .$$

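Proof. We have $\hat V(p, D) = \hat h(p, D)\, \hat R(p, D)$, where

$$\hat h(p, D) = \left(\frac{\hat Q_D - Q_D}{F_D \circ \hat Q_D - F_D \circ Q_D}\right)(p) ,$$

$$\hat R(p, D) = \sqrt{n}\, [F_D \circ \hat Q_D - F_D \circ Q_D](p) .$$

Since the function $\gamma$ is positive and uniformly continuous on $[\tau_1 - \varepsilon, \tau_2 + \varepsilon]$, uniform consistency
of the sample quantile function implies

$$\sup\{|\hat h - h|(p, D) : (p, D) \in \mathcal{I}\} \to_P 0 .$$

The process $\hat R(p, D)$ is on the other hand given by $\hat R(p, D) = \sum_{j=1}^3 \hat R_j(p, D)$, where

$$\hat R_1(p, D) = -(\hat W_D \circ Q_D)(p),$$

$$\hat R_2(p, D) = -(\hat W_D \circ \hat Q_D - \hat W_D \circ Q_D)(p) ,$$

$$\hat R_3(p, D) = \sqrt{n}\, [\hat F_D \circ \hat Q_D(p) - p] .$$

We have $\sup\{|\hat R_3(p, D)| : (p, D) \in \mathcal{I}\} \le \sup\{|\hat W_D(u) - \hat W_D(u-)| : u \in [\tau_1 - \varepsilon, \tau_2 + \varepsilon], D \in
\mathcal{D}\} = O_P(n^{-1/2})$ because the function $\hat F_D(x)$ has jumps of order $O_P(n^{-1})$. Application of the
Skorohod-Dudley-Wichura construction implies also that $\sup\{|\hat R_2(p, D)| : (p, D) \in \mathcal{I}\} \to_P 0$,
while the process $\{\hat R_1(p, D) : (p, D) \in \mathcal{I}\}$ converges weakly in $\ell^\infty(\mathcal{I})$ to $\{-W_D \circ Q_D(p) :
(p, D) \in \mathcal{I}\}$. □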

We shall now apply this result to construct pointwise confidence intervals for the $p$-th
quantile. Let $v_D(t)$ be the asymptotic variance function of the process $\{W(t, D) : (t, D) \in \mathcal{T}\}$.
It is derived in Section 4. Here we shall use only that this function is positive and continuous
on the interval $[\tau_1 - \varepsilon, \tau_2 + \varepsilon]$, and its plug-in analogue $\hat v_D(t)$ is uniformly consistent on the
set $[\tau_1 - \varepsilon, \tau_2 + \varepsilon] \times \mathcal{D}$.

For $p \in (0, 1)$ and $D \in \mathcal{D}$, let

$$p_n^\pm = p \pm \frac{1}{\sqrt{n}}\, \hat v_D(\hat Q_D(p))\, z(\alpha) ,$$

where $z(\alpha)$ is the upper $\alpha/2$ percentile of the $N(0, 1)$ distribution. Proposition 3 and the
inequalities

$$Q_D(p) > s \text{ iff } p > F_D(s),$$
$$\hat Q_D(p) > s \text{ iff } p > \hat F_D(s),$$

imply that $[\hat Q_D(p_n^-), \hat Q_D(p_n^+)]$ is a $100\% \times (1 - \alpha)$ asymptotic pointwise confidence interval for
the conditional quantile $Q_D(p)$.

Unfortunately, in practice the points $p_n^\pm$ may fall outside the range [0,1]. To circumvent
this problem, we follow the approach of Bie et al. (1987) and consider confidence intervals
based on transformations. Let $g$ be a strictly monotone cdf with density $g'$ supported on the
whole real line. Set

$$p_{nD}^\pm = g\left(g^{-1}(p) \pm \frac{1}{\sqrt{n}}\, \frac{\hat v_D(\hat Q_D(p))}{g'(g^{-1}(p))}\, z(\alpha)\right) .$$

With probability tending to 1, the inequalities

$$\hat Q_D(p_{nD}^-) \le Q_D(p) \le \hat Q_D(p_{nD}^+)$$

are equivalent to

$$-z(\alpha) \le g'(g^{-1}(p))\, \sqrt{n}\, \frac{g^{-1}(\hat F_D(Q_D(p))) - g^{-1}(p)}{\hat v_D(\hat Q_D(p))} \le z(\alpha)$$

and application of the delta method implies that $[\hat Q_D(p_{nD}^-), \hat Q_D(p_{nD}^+)]$ is a $100\% \times (1 - \alpha)$ asymptotic
confidence interval for the conditional quantile $Q_D(p)$.
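A transformed interval of this kind can be sketched with the logistic cdf, which is strictly monotone with a density supported on the whole real line; in that case $g^{-1}(p) = \log(p / (1 - p))$ and $g'(g^{-1}(p)) = p(1 - p)$. In the code below the scale input `scale` stands for the plug-in quantity that multiplies $z(\alpha) / \sqrt{n}$ in the interval, and it is treated as given; the choice of the logistic cdf, the function names and the numbers are assumptions made for illustration.

```python
import math
from bisect import bisect_left

def logit_levels(p, scale, n, z):
    # Transformed levels g(g^{-1}(p) -/+ z * scale / (sqrt(n) * g'(g^{-1}(p)))) for the logistic cdf g.
    half = z * scale / (math.sqrt(n) * p * (1.0 - p))
    centre = math.log(p / (1.0 - p))
    expit = lambda x: 1.0 / (1.0 + math.exp(-x))
    return expit(centre - half), expit(centre + half)

def step_quantile(jump_times, cdf_values, p):
    # Sample quantile inf{t : F(t) >= p} of a right-continuous step cdf.
    k = bisect_left(cdf_values, p)
    return jump_times[k] if k < len(jump_times) else math.inf

def quantile_interval(jump_times, cdf_values, p, scale, n, z=1.96):
    lo, hi = logit_levels(p, scale, n, z)
    return step_quantile(jump_times, cdf_values, lo), step_quantile(jump_times, cdf_values, hi)

# Hypothetical inputs: the untransformed upper level p + z * scale / sqrt(n) would exceed 1 here.
p, scale, n = 0.9, 1.0, 25
lo_level, hi_level = logit_levels(p, scale, n, 1.96)
interval = quantile_interval([1.0, 2.0, 3.0, 4.0], [0.25, 0.5, 0.75, 1.0], p, scale, n)
```

The transformed levels always stay inside the open unit interval, which is the reason for working on the $g^{-1}$ scale.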

Construction of simultaneous confidence sets for the function $\{Q_D(p) : (p, D) \in \mathcal{I}\}$ is more
difficult because the process $W$ appearing in Propositions 2 and 3 forms a sum of independent
Gaussian processes with correlated increments. Therefore, following Burr and Doss (1993) and
Lin, Fleming and Wei (1994), we propose the use of simulated confidence sets.



Define 

vd{Qd{p)) VD{t) 

and let $u(\alpha)$ be the upper $100\%(1 - \alpha)$ percentile of its distribution. To obtain an approximation
to the critical level $u(\alpha)$, we generate mutually independent standard normal vectors $V$ defined
as in Proposition 3, and form

$$U^* = \sup\left\{ \frac{|W^*(t, D)|}{\hat v_D(t)} : t \in [\hat Q_D(p_1), \hat Q_D(p_2)], D \in \mathcal{D} \right\} .$$

The procedure is repeated independently $m$ times, for some large $m$, to obtain $m$ iid copies
$U_1^*, \dots, U_m^*$. The estimate $\hat u(\alpha)$ of the critical point $u(\alpha)$ is taken as the empirical $(1 - \alpha)$
quantile of $U_1^*, \dots, U_m^*$. The corresponding simulated confidence set for $\{Q_D(p) : (p, D) \in \mathcal{I}\}$
is chosen as

{[QD{p-nD)^QD{plD)]--DeV] , 



where 



Application of Propositions 2-3 implies that $\hat u(\alpha)$, the upper $\alpha$-quantile of this (conditional)
distribution, satisfies $\hat u(\alpha) \to u(\alpha)$ in probability.
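The simulation step can be illustrated with a deliberately simplified multiplier scheme. The sketch below is not the process $W^*$ of Proposition 2; it assumes that the estimate admits an approximation by a normalized sum of per-subject terms `psi[i][k]` on a finite grid of points (a hypothetical input), multiplies these terms by independent standard normal variables, and takes the empirical $(1 - \alpha)$ quantile of the simulated suprema as the critical value.

```python
import math
import random

def multiplier_critical_value(psi, sd, alpha, m=2000, seed=0):
    """Empirical (1 - alpha) quantile of simulated suprema.

    psi: n x K list of per-subject terms at K grid points (assumed given);
    sd:  K standardizing values, one per grid point (assumed given).
    """
    rng = random.Random(seed)
    n, k_points = len(psi), len(sd)
    sups = []
    for _ in range(m):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]   # standard normal multipliers
        sup = max(
            abs(sum(xi[i] * psi[i][k] for i in range(n))) / (math.sqrt(n) * sd[k])
            for k in range(k_points)
        )
        sups.append(sup)
    sups.sort()
    index = min(m - 1, max(0, math.ceil((1.0 - alpha) * m) - 1))
    return sups[index]

# Sanity check on a degenerate grid with one point: the supremum is |N(0, 1)|.
psi_toy = [[1.0] for _ in range(50)]
c05 = multiplier_critical_value(psi_toy, [1.0], 0.05)
c10 = multiplier_critical_value(psi_toy, [1.0], 0.10)
```

In the one-point case the critical value should be close to the usual two-sided normal value, which gives a quick check that the simulation is wired correctly before it is applied on a full grid.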



An alternative approach to construction of simultaneous confidence sets may be based on
the bootstrap. Lin, Fleming and Wei (1994) argued that in the case of Cox regression with external
time dependent covariates, it is not clear how to implement the bootstrap to construct simultaneous
confidence bands for the conditional survival function, or other functionals related to it.
In our setting covariates are time independent, and confidence sets can be based on the "obvious"
bootstrap. We can draw $R^* = [(X_i^*, \delta_i^*, Z_i^*) : i = 1, \ldots, n]$ by sampling with replacement
from the empirical distribution function of the observations $[(X_i, \delta_i, Z_i) : i = 1, \ldots, n]$. For
each sequence $R^*_j$, $j = 1, \ldots, m$, we can compute bootstrap estimates $\{\hat Q^*_D(p) : (p, D) \in \mathcal{I}\}$
and next use them to approximate the distribution of the quantile process. Although it is
possible to show consistency of this procedure, its drawback lies in the computational burden
needed to construct the estimates $(\theta^*_j, \Gamma^*_j)$ for each of the $m$ simulated data sets. In the case
of the proportional hazard model, Hjort (1985) proposed the use of a "model based" bootstrap.
Burr and Doss (1993) applied it to the construction of simultaneous confidence bands for the
conditional median. In this approach, the distribution of the quantile process is approximated
based on artificial observations $(X_i^*, \delta_i^*)$, $i = 1, \ldots, n$, defined as $X_i^* = T_i^* \wedge \tilde T_i^*$, $\delta_i^* = 1(T_i^* \le \tilde T_i^*)$,
where $T_i^*$ is sampled from the distribution $F(\Gamma_{n\hat\theta}(t), \hat\theta | Z_i)$ and $\tilde T_i^*$ is sampled from $\hat G(t) = 1 -$
Kaplan-Meier estimate of the censoring distribution. This approach uses the assumption that
the censoring time is independent of covariates, which need not be satisfied in many practical
situations. It is in principle possible to relax it by choosing a parametric or a semi-parametric model
for the conditional distribution of censoring times; however, selection of such a model is often
quite difficult, and its misspecification may affect the performance of confidence procedures.
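
The resampling step of the "obvious" bootstrap can be sketched as follows. This is a minimal illustration, assuming the data are stored as a list of (X, delta, Z) triples; the function names and the toy statistic are hypothetical and not taken from the paper. In the actual procedure the statistic would refit the transformation model on each resample, which is the computational burden noted above.

```python
import random

def resample_triples(sample, rng):
    """Draw len(sample) triples (X, delta, Z) with replacement from the observed sample."""
    n = len(sample)
    return [sample[rng.randrange(n)] for _ in range(n)]

def bootstrap_replicates(sample, statistic, m, seed=0):
    """Evaluate `statistic` on m bootstrap resamples of the triples."""
    rng = random.Random(seed)
    return [statistic(resample_triples(sample, rng)) for _ in range(m)]

def censoring_fraction(sample):
    """Toy statistic standing in for the (much costlier) model refit."""
    return sum(1 - delta for _, delta, _ in sample) / len(sample)

# Hypothetical toy data: (observed time X, censoring indicator delta, covariate Z).
toy_sample = [(5.0, 1, 0.3), (8.0, 0, -1.2), (2.5, 1, 0.7), (9.1, 1, 0.0), (4.4, 0, 1.5)]
replicates = bootstrap_replicates(toy_sample, censoring_fraction, m=200)
```

Whole triples are resampled together, so any dependence between censoring and covariates in the data is preserved, in contrast with the model based bootstrap.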



3 Example 



For illustrative purposes we consider now data from the Veteran's Administration lung cancer
trial (Kalbfleisch and Prentice, 2000). In this trial males with inoperable lung cancer were
randomized to either a standard or an experimental chemotherapy treatment and subsequently
followed until death or withdrawal from the study. We shall look at the subgroup of 97 patients
who received no prior therapy, and use two covariates corresponding to performance status at
the time of entry into the clinical trial and histopathological type of tumor (squamous, small
cell, adeno and large cell).

Several authors (e.g. Bennett (1983), Pettitt (1984), Cheng et al. (1995) and Murphy,
Rossini and van der Vaart (1996)) proposed the use of the proportional odds ratio model for analysis
of this dataset. Our estimates are easy to compute in this case because the hazard rate of the
i-th subject satisfies

$\alpha_i(x,\theta) = e^{\theta^T Z_i}(1 + e^{\theta^T Z_i}x)^{-1}$, $\ell_i'(x,\theta) = -\alpha_i(x,\theta)$, $\dot\ell_i(x,\theta) = Z_i e^{-\theta^T Z_i}\alpha_i(x,\theta)$. (13)
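
The three expressions in (13) can be checked numerically. The sketch below assumes that the second and third expressions are the derivatives of the log hazard with respect to x and theta; the function names are illustrative only.

```python
import math

def alpha(x, theta, z):
    """Proportional odds hazard: exp(theta'z) / (1 + exp(theta'z) * x)."""
    s = math.exp(sum(t * zk for t, zk in zip(theta, z)))
    return s / (1.0 + s * x)

def dlog_alpha_dx(x, theta, z):
    """Derivative of log(alpha) with respect to x, equal to -alpha."""
    return -alpha(x, theta, z)

def dlog_alpha_dtheta(x, theta, z):
    """Gradient of log(alpha) with respect to theta, equal to z * exp(-theta'z) * alpha."""
    s = math.exp(sum(t * zk for t, zk in zip(theta, z)))
    a = alpha(x, theta, z)
    return [zk * a / s for zk in z]

def finite_difference(f, u, h=1e-6):
    """Central difference approximation to f'(u)."""
    return (f(u + h) - f(u - h)) / (2.0 * h)

# Arbitrary illustrative values.
theta, z, x = [0.4, -0.3], [1.0, 2.0], 0.8
num_dx = finite_difference(lambda u: math.log(alpha(u, theta, z)), x)
num_dtheta0 = finite_difference(lambda u: math.log(alpha(x, [u, theta[1]], z)), theta[0])
```

Both numerical derivatives agree with the closed forms to within the accuracy of the finite difference.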

For fixed $\theta$, the estimate $\Gamma_{n\theta}$ is computed based on the recurrent formula given by
Bagdonavicius and Nikulin (1999):

$\Gamma_{n\theta}(t) = \Gamma_{n\theta}(t-) + N_\cdot(\Delta t)\, S(\Gamma_{n\theta}(t-), \theta, t)^{-1}$

with the initial condition $\Gamma_{n\theta}(0-) = 0$. The sample version of the function can be evaluated
as 

S^{Tne{t-),d,t) 

and $\dot\Gamma_{n\theta}(0-) = 0$. The solution to the Fredholm equation can be obtained as follows. Let
$X_{(1)} < \ldots < X_{(m)}$, $m \le n$, be the distinct uncensored observations in the sample. Dropping
dependence on the parameter $\theta$, let $B_n, C_n$ be the plug-in sample analogues of the functions
$B_\theta$ and $C_\theta$. These are step functions with jumps at points $X_{(i)}$ and we arrange their jumps
into $m \times m$ diagonal matrices $B_n(\Delta X) = \mathrm{diag}\{B_n(\Delta X_{(i)}) : i = 1, \ldots, m\}$ and $C_n(\Delta X) =
\mathrm{diag}\{C_n(\Delta X_{(i)}) : i = 1, \ldots, m\}$. Let $\rho_n(X)$ be an $m \times d$ matrix of the sample analogues of
the conditional covariances $\rho_{-\dot\Gamma}(u, \theta)$ at points $X_{(i)}$, $i = 1, \ldots, m$. (Here $d$ is the dimension of the
parameter $\theta$.) The matrix $C_n(\Delta X)$ has positive entries, the matrix $B_n(\Delta X)$ nonnegative.
If $B_n(\Delta X) = 0$ then also $\rho_n(X) = 0$. Setting $\psi_{n\theta} = \varphi_{n\theta} + \dot\Gamma_{n\theta}$, the discrete version of the
Fredholm equation corresponds to

$[I + K_n(X) B_n(\Delta X)]\,\psi_n(X) = K_n(X)\rho_n(X)$,

where $\psi_n(X) = [\psi_n(X_{(i)}) : i = 1, \ldots, m]^T$ is an $m \times d$ matrix of unknowns, $K_n(X)$ is an $m \times m$
matrix with entries $K_n(X) = [K_n(X_{(i)}, X_{(j)})]$ and $I$ represents an $m \times m$ identity. If $B_n(\Delta X) = 0$
or $\rho_n(X) = 0$ then the solution is $\psi_n(X) = 0$. Otherwise, $\psi_n(X) = P_n(X) g_n^{-1}(X) P_n(X)\rho_n(X)$,
where $g_n(X) = [g_{ij}]$ is a tridiagonal symmetric matrix with entries $g_{ii} = c_i + c_{i+1} + b_i$, $g_{i,i+1} =
-c_{i+1} = g_{i+1,i}$, $i = 1, \ldots, m-1$, and $g_{mm} = c_m + b_m$, where $b_i = P_{n\theta}(0, X_{(i)})^2 B_n(\Delta X_{(i)})$, $c_i =
P_{n\theta}(0, X_{(i)})^2 C_n(\Delta X_{(i)})^{-1}$ and $P_n(X) = \mathrm{diag}[\exp(-\int_0^{X_{(i)}} S'(\Gamma_{n\theta}(u-), \theta, u)\, C_{n\theta}(du)) : i =
1, \ldots, m] = \mathrm{diag}[P_{n\theta}(0, X_{(i)}) : i = 1, \ldots, m]$ (Dabrowska, 2005). After obtaining the solution
$\psi_{n\theta}$, we set $\varphi_{n\theta} = \psi_{n\theta} - \dot\Gamma_{n\theta}$. The estimate $\hat\theta$ can be obtained using the Fisher scoring algorithm.
The algorithm can be started at a value $\theta^{(0)}$ obtained by solving the same score equation, but with the
function $\varphi_{n\theta}$ set to $0$ or $-\dot\Gamma_{n\theta}$.
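
The tridiagonal solution of the discrete Fredholm equation can be checked on a small example. The sketch below is only an illustration under stated assumptions: the parameter is scalar (d = 1), the vectors P, B, C hold the diagonal entries of the corresponding matrices, and the kernel is taken to be K(X_(i), X_(j)) = P_i P_j sum_{k <= min(i,j)} C_k / P_k^2, which is the form under which the tridiagonal matrix g arises. All function names and numerical values are hypothetical.

```python
def solve_tridiagonal(off, diag, rhs):
    """Thomas algorithm for a symmetric tridiagonal system; off[i] is entry (i, i+1)."""
    m = len(diag)
    c_star = [0.0] * m
    d_star = [0.0] * m
    c_star[0] = off[0] / diag[0] if m > 1 else 0.0
    d_star[0] = rhs[0] / diag[0]
    for i in range(1, m):
        denom = diag[i] - off[i - 1] * c_star[i - 1]
        c_star[i] = off[i] / denom if i < m - 1 else 0.0
        d_star[i] = (rhs[i] - off[i - 1] * d_star[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d_star[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d_star[i] - c_star[i] * x[i + 1]
    return x

def fredholm_solution(P, B, C, rho):
    """psi = P g^{-1} P rho with g tridiagonal, built from c_i = P_i^2 / C_i and b_i = P_i^2 B_i."""
    m = len(P)
    c = [P[i] ** 2 / C[i] for i in range(m)]
    b = [P[i] ** 2 * B[i] for i in range(m)]
    diag = [c[i] + (c[i + 1] if i + 1 < m else 0.0) + b[i] for i in range(m)]
    off = [-c[i + 1] for i in range(m - 1)]
    y = solve_tridiagonal(off, diag, [P[i] * rho[i] for i in range(m)])
    return [P[i] * y[i] for i in range(m)]

def residual(P, B, C, rho, psi):
    """Largest absolute error in [I + K B] psi = K rho for the assumed dense kernel K."""
    m = len(P)
    d = [C[k] / P[k] ** 2 for k in range(m)]
    K = [[P[i] * P[j] * sum(d[: min(i, j) + 1]) for j in range(m)] for i in range(m)]
    lhs = [psi[i] + sum(K[i][j] * B[j] * psi[j] for j in range(m)) for i in range(m)]
    rhs = [sum(K[i][j] * rho[j] for j in range(m)) for i in range(m)]
    return max(abs(l - r) for l, r in zip(lhs, rhs))

# Hypothetical values at m = 4 distinct uncensored observations.
P = [0.9, 0.8, 0.75, 0.6]
B = [0.2, 0.0, 0.5, 0.3]
C = [0.1, 0.3, 0.2, 0.4]
rho = [1.0, -0.5, 0.25, 2.0]
psi = fredholm_solution(P, B, C, rho)
max_residual = residual(P, B, C, rho, psi)
```

The tridiagonal solve costs O(m) operations per column of the right-hand side, against O(m^3) for a direct solve of the dense system. For d > 1 the same routine would be applied to each column separately.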

The estimate $\Gamma_{n\hat\theta}$ is a cadlag step function with jumps at uncensored observations, and so is
the estimate $\hat F_D(t)$ of the conditional distribution function of $T$ given $Z \in D$. Thus the graph
of the quantile function can be obtained by inverting graphically the plot of this function. The
estimate $\hat V_D(t)$ of the asymptotic variance of $\sqrt{n}[\hat F_D - F_D](t)$ and the process $W^{\#}(t, D)$
can be easily computed based on expressions given in Sections 2 and 4.
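
The computation of the transformation estimate at a fixed parameter value and the inversion of the resulting step function can be sketched as follows. This is a simplified illustration, not the paper's implementation. It assumes a scalar covariate, S(x, theta, t) equal to the average of Y_i(t) alpha_i(x, theta) with Y_i(t) = 1(X_i >= t), jumps of N equal to the fraction of uncensored observations at t, and the conditional distribution function G(Gamma(t) e^{theta z}) with G(x) = x/(1 + x) for a single covariate level z rather than a set D.

```python
import math
from bisect import bisect_left

def transformation_estimate(times, events, covs, theta):
    """Recursive estimate of Gamma at the distinct uncensored times (proportional odds)."""
    n = len(times)
    scale = [math.exp(theta * z) for z in covs]
    jump_times = sorted({t for t, d in zip(times, events) if d == 1})
    gamma_prev = 0.0  # initial condition Gamma(0-) = 0
    gamma = []
    for t in jump_times:
        d_n = sum(1 for x, d in zip(times, events) if d == 1 and x == t) / n
        # Average hazard over subjects still at risk, evaluated at the left limit of Gamma.
        s = sum(sc / (1.0 + sc * gamma_prev) for x, sc in zip(times, scale) if x >= t) / n
        gamma_prev += d_n / s
        gamma.append(gamma_prev)
    return jump_times, gamma

def conditional_cdf(gamma, theta, z):
    """F(t|z) = G(Gamma(t) * exp(theta * z)) with G(x) = x / (1 + x)."""
    s = math.exp(theta * z)
    return [s * g / (1.0 + s * g) for g in gamma]

def quantile(jump_times, cdf, p):
    """inf{t : F(t) >= p} for a nondecreasing step function; None if F never reaches p."""
    k = bisect_left(cdf, p)
    return jump_times[k] if k < len(cdf) else None

# Hypothetical uncensored toy sample with theta = 0.
jump_times, gamma = transformation_estimate([1.0, 2.0, 3.0, 4.0], [1, 1, 1, 1], [0.0] * 4, 0.0)
cdf = conditional_cdf(gamma, 0.0, 0.0)
median = quantile(jump_times, cdf, 0.5)
```

In this toy case the estimated distribution function takes the values k/(n + 1) at the k-th ordered observation. Returning None when the step function never reaches p reflects that upper quantiles may not be estimable from a censored sample.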

Table 1 provides regression coefficients and their standard errors for the Veteran's Administration
lung cancer data. In this data set the performance score (PS) ranges between 10
and 99, with lower values indicating poorer performance status at the time of entry into the
trial. This covariate was used in the regression model after standardizing it to have average
zero and standard deviation 1. The negative sign of the regression coefficient indicates that
patients with a higher performance score have lower odds of death and thereby a better survival
experience. Patients with squamous tumor have slightly lower odds of death than large cell
tumor patients; however, the difference is not significant. Patients with adeno or small cell
tumor have higher odds of death than patients with squamous or large cell types.

Table 1 about here 

We shall consider now two partitions $\mathcal{D}$ of the covariate space. In both cases, we shall
consider quantile regression estimates in the range $p \in (.25, .75)$. Simultaneous confidence sets
are based on the transformation $g(p) = \log(-\log(1 - p))$ and we used 1000 Monte Carlo
simulations of the $V$ vectors (Section 2) to obtain the critical points.
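
The role of the transformation and of the simulated critical point can be illustrated with a short sketch. The exact form of the confidence sets is given in Section 2 and is not reproduced here; the delta-method band below, the choice of empirical quantile, and all names are assumptions made for illustration only.

```python
import math

def g(p):
    """Complementary log-log transformation log(-log(1 - p)) for p in (0, 1)."""
    return math.log(-math.log(1.0 - p))

def g_inverse(y):
    """Inverse transformation: 1 - exp(-exp(y))."""
    return 1.0 - math.exp(-math.exp(y))

def g_derivative(p):
    """g'(p) = -1 / ((1 - p) log(1 - p)), positive on (0, 1)."""
    return -1.0 / ((1.0 - p) * math.log(1.0 - p))

def critical_point(sup_statistics, alpha):
    """Empirical upper alpha-quantile of simulated supremum statistics."""
    ordered = sorted(sup_statistics)
    k = int(math.ceil((1.0 - alpha) * len(ordered))) - 1
    return ordered[max(0, min(k, len(ordered) - 1))]

def transformed_band(p_hat, std_error, c):
    """Band for a probability, built on the g scale and mapped back to (0, 1)."""
    half_width = c * std_error * g_derivative(p_hat)
    return g_inverse(g(p_hat) - half_width), g_inverse(g(p_hat) + half_width)

# Hypothetical inputs: 1000 simulated supremum statistics and one point estimate.
simulated = [k / 100.0 for k in range(1, 1001)]
c_alpha = critical_point(simulated, 0.05)
lower, upper = transformed_band(0.5, 0.05, 2.5)
```

Working on the transformed scale keeps both limits inside (0, 1) regardless of the size of the critical point.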

The first partition corresponds to the four histopathological types of tumor. Figure 1
shows the corresponding quantile regression and confidence set for the conditional quantiles.
The plots support the results of Table 1 and show that patients with squamous or large cell tumor
perform better than patients with adeno or small cell tumor. However, within each pair of
tumor types, the confidence sets are nearly the same so that the differences are small.

Figure 1 about here 
Next we partition the covariate space according to the performance status at the time of
entry into the trial. We consider patients who are completely hospitalized (PS < 40), partially
confined (PS $\in$ [40, 70)) and able to care for themselves (PS $\ge$ 70). In Figure 2, the confidence sets
for the hospitalized and partially confined patients nearly overlap, suggesting similar survival
experience after treatment. This experience is much worse than for patients who are able
to care for themselves. For example, the estimated median time till death for hospitalized, partially
confined and able to care patients is 25, 29 and 110 days, respectively. The corresponding confidence
bounds are (22,35), (24,36) and (103,112) days. Figure 2 also suggests that the effect of the
PS score is not linear, and a regression model using a binary covariate, $Z = 1$ if PS $\ge 70$ and
$Z = 0$ if PS $< 70$, may be more appropriate.

Figure 2 about here 

We have also considered the choice of the proportional hazard model and generalized inverse
Gaussian frailty model. In each of these models the regression coefficients had the same sign;
however, none of the transformation models could be fully justified. In Figure 3 we show
nonparametric plots of the Aalen-Nelson estimator, odds ratio function and Kaplan-Meier
estimator of the survival function for the four tumor cell types: squamous (solid line), large
(dotted line), small (short dash) and adeno (long dash). The plots of the cumulative hazard
function of the large and squamous cell type cross at around 150 days. Patients with squamous
cell type are initially at a higher risk for death but at around 150 days after treatment the
role of the two groups is reversed. The corresponding plots of the odds ratio function suggest
that the choice of proportional hazard model may not be appropriate and that odds ratio
functions are close for the two groups. In the case of the adeno and small cell tumor type
groups, the graphs of both cumulative hazard and odds ratio functions cross only at the upper
tail; however, the two groups can only be compared during the initial 180 days.

Figure 3 about here 

These graphs illustrate a typical difficulty arising in regression analyses based on transformation
models of type (1) or (2). The transformation models assume that the conditional
distributions of the failure time $T$ given $Z = z$ have the same support as the marginal distribution
of $T$ for $\mu$-almost all $z$. This assumption fails to be satisfied in the fully nonparametric
setting, which places no restrictions on the support or shape of the conditional distribution
of $T$ given $Z = z$. If $F(t|z)$ represents the conditional distribution function of $T$ given $Z = z$
and $G$ is the corresponding marginal distribution function of $T$, then setting

$\tau_1(z) = \inf\{t : F(t|z) > 0\}$, $\tau_2(z) = \sup\{t : F(t|z) < 1\}$,
$\tau_1 = \inf\{t : G(t) > 0\}$, $\tau_2 = \sup\{t : G(t) < 1\}$,

we have $\tau_1 \le \tau_1(z) < \tau_2(z) \le \tau_2$ for $\mu$-almost all $z$, i.e. the marginal distribution of $T$ has
a wider support than the conditional distributions. For different covariate levels $z_1$ and $z_2$, the
intervals $[\tau_1(z_1), \tau_2(z_1)]$ and $[\tau_1(z_2), \tau_2(z_2)]$ may be very different.

In the present example, large and squamous cell type patient groups have a longer support
interval than the groups of small and adeno cell types. Apparently, patients for whom
treatment is beneficial live longer. The choice of the proportional odds ratio model appears to
be more appropriate than the proportional hazards model; however, it does not accommodate
variable support intervals of conditional distributions of different subgroups. The problem
applies to all transformation models of type (1) and (2). The plots of Kaplan-Meier estimators
corresponding to the four groups are proper survival functions in this data example because
the data are lightly censored (Kalbfleisch and Prentice, 2000). In moderately or heavily censored
samples, the grouped data Kaplan-Meier estimator will often form an improper survival
function. In such circumstances, variable supports of the Kaplan-Meier estimator may also indicate
the presence of informative censoring. The difficulties in handling variable supports of conditional
distributions apply also to other common parametric and semiparametric regression models in
survival analysis and are very common in practical applications.
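
Whether a grouped Kaplan-Meier estimator forms a proper survival function can be checked directly: the estimate reaches zero only when the largest observation in the group is uncensored. A minimal sketch follows, with hypothetical function names and toy data rather than the lung cancer data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct uncensored time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def is_proper(curve):
    """True when the estimated survival function drops to zero."""
    return bool(curve) and curve[-1][1] == 0.0

# Toy groups: the first ends with a death, the second with a censored observation.
proper_curve = kaplan_meier([2.0, 3.0, 5.0, 8.0], [1, 0, 1, 1])
improper_curve = kaplan_meier([2.0, 3.0, 5.0, 8.0], [1, 0, 1, 0])
```

In the second group the estimate stays at 0.375 beyond the last death, so the upper end of the support is not identified from the sample.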



4 Proofs 

In this section, we denote by $M_i(t)$ the process

$M_i(t) = N_i(t) - \int_0^t Y_i(u)\,\alpha_i(\Gamma_{\theta_0}(u), \theta_0)\,\Gamma_{\theta_0}(du)$,

where $\Gamma_0 = \Gamma_{\theta_0}$ is the "true" transformation. Then $M_i$ are independent mean zero martingales
with respect to the natural filtration generated by $\mathcal{F}_t = \sigma\{(N_i(s), Y_i(s+), Z_i) : s \le t, i = 1, \ldots, n\}$.
For any measurable functions $g_q(u, z)$, $q = 1, 2$ such that




we have 




Lemma 1 Suppose that the conditions of Propositions 1 and 2 are satisfied. 
(i) The estimate $\hat\theta$ satisfies $\sqrt{n}[\hat\theta - \theta_0] = \Sigma(\theta_0)^{-1}\sqrt{n}\,U_n(\theta_0) + o_p(1)$, where $\Sigma(\theta) = \Sigma_1(\theta) +
\Sigma_2(\theta)$ and $U_n(\theta_0) = n^{-1}\sum_{i=1}^n [U_{1i}(\theta_0) + U_{2i}(\theta_0)]$ is given by



and 



UiiiOo) = / hi{Te,{u),eo,u)Mi{du) , 
Jo 

U2i{eo) = - Woi{t)p^,^{t,eo)EN{dt) , 

Jo s{ro{u-),eo,u) 

s s' 

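The sums $n^{-1/2}\sum_{i=1}^n U_{1i}(\theta_0)$ and $n^{-1/2}\sum_{i=1}^n U_{2i}(\theta_0)$ are uncorrelated and converge weakly
to independent mean zero normal vectors with covariances $\Sigma_1(\theta_0)$ and $\Sigma_2(\theta_0)$. Moreover,

$\sqrt{n}[\Gamma_{n\hat\theta} - \Gamma_{\theta_0} - (\hat\theta - \theta_0)^T\dot\Gamma_{\theta_0}](t) = n^{-1/2}\sum_{i=1}^n W_{0i}(t) + o_p(1)$

uniformly in $t \in [0, \tau]$.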
(ii ) We have J^gniO) ^q{0o) for g = 1,2, 

ll^ne ~ ^"''o II 0' W^nd ~ ^rieo II , 

II ^ %(u,^)A^.(c?^) - p^eoi^,eo)EN{du)\\ , 
11/ ^{r^0{u-)Au)N,{du) - ^iT0,{u-),eo,u)EN{du)\\ ^pO, 
11/ ^{r„oiu-),e,u)N,{du)- J ^{Tg,{u-),eo,u)ENidu)\\ ^pO, 
limsupexp / ^(r^(n-),^,u)iV.(dn) =Op(l) 

n Jo iJ 

and Vg{u,t) -^p V0Q{u,t) uniformly in < « < t < r. 
(iii) Let

$\psi_1(t, D) = \pi(D)^{-1} E\,1(Z_i \in D) f(\Gamma_0(t), \theta_0|Z_i)$,

$\psi_2(t, D) = \psi_1(t, D)\dot\Gamma_{\theta_0}(t) + \pi(D)^{-1} E\,1(Z_i \in D)\dot F(\Gamma_0(t), \theta_0|Z_i)$

and let $\hat\psi_q$, $q = 1, 2$ be the estimate of this function obtained by replacing the pair $(\theta_0, \Gamma_0)$
and the function $\pi(D)$ by $(\hat\theta, \Gamma_{n\hat\theta})$ and $\hat\pi(D)$. Then $\|\hat\psi_q - \psi_q\| \to_p 0$, $q = 1, 2$.
(iv) Parts (ii) and (iii) remain valid if the estimates $(\hat\theta, \Gamma_{n\hat\theta})$ are replaced by $(\theta^*, \Gamma^*)$ such
that $\theta^* \to_p \theta_0$ and $\|\Gamma^* - \Gamma_{\theta_0}\| \to_p 0$.



We omit the proof of this lemma. Parts (i)-(ii) and (iv) can be found in Dabrowska (2005),
while part (iii) is a straightforward consequence of parts (i)-(ii).



Proof of Proposition 1. We have 



where 



Wi{t,D) 
W2{t,D) 



i=i 



1 1 



^ l{Zi e D)[F{roit), eo\Zi) - Foit)] , 



i=l 



^VWoi(t)Vi («,£»)- V'2(t,£»)^S-i(^o)^Vc/2i(^o) , 

i=l 

— 1 " 

W3{t,D) = Mt,Df^~\do)^Y.Uu{eo) 

- W2{t,D)-Ws{t,D) . 



-^=-^ HZi e D)[F{rg{t, e\z,) - F{ro{t), 9o\Zi)] 



Here Wj{t, D),j = 1, 2, 3 represent uncorrelated sums of mean zero iid processes with finite 
variance and covariance 



cov {Wiiti,Di),Wiit2,D2)) = Tr{Di)Tr{D2r^El{Zi e DiH D2)F{h\Z)F(t2\Z) , 

- FD,{ti)FD,{t2) 

coy{W2{h,Di),W2{t2,D2)) = COY {Wo{t),Wo{t'))Mti,DMh,D2) 

+ Mti,DifcoY {Wo{ti),T)i;i{t2,D2) 

+ [Mti,DifcoY {Wo{ti),T)Mt2,D2)f 

+ Mti,DifVarTi^2it2,D2) 

- COY {Ws{ti,Di),W-i{t2,D2)) , 

COY {W3{h,D^),W3{t2,D2)) = Mti,Dif^-\eo)^i{eo)^-\eo)Mt2,D2) 

and, from section 2, 

cov T = ^-\eo), COY (T, Woit)) = -S-i(0o)K + te^Kt) , 
COY {Wo{t),Wo{t')) = Ke,{t,t') . 



(14) 
We also have

Jo 



By the central limit theorem, finite dimensional distributions of the processes $\{W_j, j = 1, 2, 3\}$
converge weakly to a multivariate normal vector with covariance matrix given by (14).

For each $j = 1, 2, 3$, the process $\{W_j(t, D) : (t, D) \in \mathcal{T}\}$ can be represented as
$n^{-1/2}\sum_{i=1}^n h^{(j)}_{t,D}(X_i, \delta_i, Z_i)$, with $h^{(j)}_{t,D}$ varying over a Euclidean class of functions $\mathcal{H}_j = \{h^{(j)}_{t,D} :
(t, D) \in \mathcal{T}\}$ for a square integrable envelope (Nolan and Pollard, 1987). This can be verified
by noting that $\mathcal{D}$ is a finite collection of sets, and for each $D \in \mathcal{D}$, the relevant functions
$h^{(j)}_{t,D} \in \mathcal{H}_j$ can be represented as finite linear combinations of functions of bounded variation
with respect to $t$. We also have $E h^{(j)}_{t,D}(X_i, \delta_i, Z_i) = 0$ for each $h^{(j)}_{t,D} \in \mathcal{H}_j$. Hence the process
$W_j = G_{n,j} = \{\sqrt{n}[P_n - P](h^{(j)}_{t,D}) : h^{(j)}_{t,D} \in \mathcal{H}_j\}$ is equicontinuous and $\mathcal{H}_j$ is totally bounded with
respect to the variance semi-metric $\rho_j$. Set $\rho = \max\{\rho_j : j = 1, 2, 3\}$. Then $\mathcal{T}$ is totally bounded
with respect to $\rho$, and $\{W_j : j = 1, 2, 3\}$ is asymptotically tight in $\ell^\infty(\mathcal{T})$ and converges
weakly to a Gaussian process $\{W_j : j = 1, 2, 3\}$. Its components are independent, and $W_j$ have
covariance function given by the right-hand side of (14).

Using Taylor expansion, we also have W4{t, D) = W\i{t, D) + W42{t, D), where 
W^iit, D) = V^iVneit) - Te.it) - {9- eofto{t))r,it, D) 
- ^I]^oi(i)V'i(i,^) , 

W42{t,D) = ^i(t,D)'^^{e-eQ)-Mt,DYY.-\eo)VTiUn{eo) 

and 

1=1 

r2{t,D) = ri{t,D)to{t) + ^—j2HZieD)F{r*{t),0*\Zi). 

mriJJ) ^ — ' 
^ ' 1=1 

Here $\theta^*$ is on a line segment between $\theta_0$ and $\hat\theta$, and $\|\Gamma^* - \Gamma_{\theta_0}\| \to_p 0$. By Lemma 1,

$\sup\{|W_{4p}(t, D)| : (t, D) \in \mathcal{T}\} \to_p 0$

for $p = 1, 2$. To complete the proof of part (i) of Proposition 3, we note that $\hat\pi(D) \to \pi(D)$
a.s. for $D \in \mathcal{D}$ so that $\hat W = \{(\pi(D)/\hat\pi(D))\sum_{j=1}^4 W_j(t, D) : (t, D) \in \mathcal{T}\}$ converges weakly in
$\ell^\infty(\mathcal{T})$ to $W = \{W(t, D) = \sum_{j=1}^3 W_j(t, D) : (t, D) \in \mathcal{T}\}$. Its variance function is given by
$V_D(t) = \sum_{j=1}^3 \mathrm{var}\, W_j(t, D)$. For any $D$ this is a continuous function with respect to $t$ and
positive on any interval $[\tau_1 - \varepsilon, \tau_2 + \varepsilon]$ on which $\Gamma_{\theta_0}$ forms a continuous strictly increasing
function.



To show part (ii), first recall that $V_i = (V_{1i}, V_{2i})$, $i = 1, \ldots, n, \ldots$ and $V_3 = (V_{31}, \ldots, V_{3d})$
are mutually independent $N(0, 1)$ variables, independent of $R_i = (X_i, \delta_i, Z_i)$, $i = 1, \ldots, n$. We
let the variables $R_i$, $i = 1, 2, \ldots$ be defined as coordinate projections on the "first" $\infty$ coordinates
in the product probability space $(\Omega^\infty \times \mathcal{V} \times \mathcal{V}', \mathcal{F}^\infty \times \mathcal{B} \times \mathcal{B}', P^\infty \times Q \times Q')$ and let $V_i$, $i = 1, 2, \ldots$
and $V_3$ be defined on the "last" two coordinates.



Set 



where 



~ 11 

Wi{t,D) = ^VumGD)[FiTe,{t),eo\Z,)-FDit)], 
W2{t,D) = Wo{t)Mt,D) + 1^ Wo{s)p^,^{s,eo)ENXds)J:-\eo)Mt,D) 



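For $j, k = 1, 2, 3$, $j \ne k$, we have

$\mathrm{cov}(\tilde W_j(t, D), \tilde W_j(t', D')) = \mathrm{cov}(W_j(t, D), W_j(t', D'))$,

$\mathrm{cov}(\tilde W_k(t, D), \tilde W_j(t', D')) = \mathrm{cov}(\tilde W_k(t, D), W_j(t', D')) = \mathrm{cov}(W_k(t, D), W_j(t', D')) = 0$.

Also $\tilde W_3$ does not involve $n$, the $R_i$, $i = 1, 2, \ldots$ or the $V_{ji}$, $j = 1, 2$, $i = 1, 2, \ldots$ sequences, and
is independent of the processes $\tilde W_j$, $j = 1, 2$ and $W_j$, $j = 1, 2, 3$.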

Similarly to part (i), the processes $\{\tilde W_j(t, D) : (t, D) \in \mathcal{T}, j = 1, 2\}$ are of the form
$\tilde W_j(t, D) = n^{-1/2}\sum_{i=1}^n V_{ji}\, g^{(j)}_{t,D}(X_i, \delta_i, Z_i)$, where $g^{(j)}_{t,D}$ varies over $\mathcal{G}_j = \{g^{(j)}_{t,D}(x, d, z) : (t, D) \in \mathcal{T}\}$, a
Euclidean class of functions for a square integrable envelope that is totally bounded with respect
to the semi-metric $\rho$. The class of products $\{v g^{(j)}_{t,D}(x, \delta, z) : (t, D) \in \mathcal{T}\}$ is also Euclidean.
Therefore, unconditionally $[\tilde W_j : j = 1, 2]$ is asymptotically tight and converges to a Gaussian
process $[W_j^* : j = 1, 2]$, whose components are independent and independent of $\tilde W_3$ and
$[W_1, W_2, W_3]$.
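
The conditional behaviour of a multiplier sum of this form is easy to simulate. The sketch below fixes hypothetical values g(R_i) and draws independent standard normal multipliers; conditionally on the data, the sum n^{-1/2} sum_i V_i g(R_i) is normal with mean zero and variance n^{-1} sum_i g(R_i)^2. The values and function names are illustrative assumptions.

```python
import random

def multiplier_draws(g_values, num_draws, seed=0):
    """Simulate n^{-1/2} * sum_i V_i g_i with V_i iid N(0, 1), holding the g_i fixed."""
    rng = random.Random(seed)
    scale = len(g_values) ** -0.5
    return [scale * sum(rng.gauss(0.0, 1.0) * gv for gv in g_values)
            for _ in range(num_draws)]

def conditional_variance(g_values):
    """Variance of the multiplier sum given the data: the average of g_i squared."""
    return sum(gv * gv for gv in g_values) / len(g_values)

# Hypothetical fixed values g(R_i) for a sample of size 6.
g_values = [0.5, -1.0, 0.25, 1.5, -0.75, 0.1]
draws = multiplier_draws(g_values, num_draws=4000)
empirical_variance = sum(d * d for d in draws) / len(draws)
```

Only the multipliers are redrawn across simulations, so no model refit is needed, which is what makes this approach cheaper than the bootstrap.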

Alternatively, for j = 1, we have g^^l) = h[^]^ with Ph[^]^ = and 

W,{t,D) = ^Vri{5R^ - P)[9t,D] = ^^Vu5R,[gt,D] • 
V" i=l V" 
For $j = 2$



i=l V" .^1 

= VF2l(t,i^) + VF22 

and the two components on the right-hand side are uncorrelated. Application of the unconditional
multiplier central limit theorem in van der Vaart and Wellner (1996, Corollary
2.9.4, p. 180) implies that the processes $[\tilde W_1, \tilde W_{21}, \tilde W_{22}, \tilde W_3]$ and $[W_1, W_2, W_3]$ converge jointly
in $[\ell^\infty(\mathcal{T})]^4 \times [\ell^\infty(\mathcal{T})]^3$ to independent Gaussian processes $[W_1^*, W_{21}^*, W_{22}^*, W_3^* = \tilde W_3]$ and
$[W_1, W_2, W_3]$. By the continuous mapping theorem, we also have unconditional weak convergence
of $[\tilde W = \sum_{j=1}^3 \tilde W_j, W = \sum_{j=1}^3 W_j]$ in $\ell^\infty(\mathcal{T}) \times \ell^\infty(\mathcal{T})$ to a vector of independent Gaussian
processes with the same covariance function.

Conditionally on $R_1, R_2, \ldots$, the processes $\tilde W_1$, $\tilde W_{21}$ and $\tilde W_{22}$ have mean zero,

coy v[Wiiti,Di),Wi{t2,D2)] 

COV v[W2litl,Di),W2l{t2,D2)] 

COY v[W22itl,Di),W22{t2,D2)] 
COY v[W2l{tl,Di),W22it2,D2)] 

COY v[Wl{h,Di),W2jit2,D2)] 
COY v[W3{ti,Di),W2jit2,D2)] 
COYv[W3ih,Di),Wi{t2,D2)] 

for almost all $R_1, R_2, \ldots$. (Actually, conditionally on $R_1, R_2, \ldots$, the $\tilde W_j$ processes are independent.)
By the conditional multiplier CLT, conditionally on $R_1, R_2, \ldots$, the finite
dimensional distributions of $\tilde W_1$ and $\tilde W_2$ are asymptotically multivariate normal and independent,
for almost all $R_1, R_2, \ldots$. The covariance function is the same as that of the finite dimensional
distributions of $W_1$ and $W_2$. By the continuous mapping theorem, conditionally
on $R_1, R_2, \ldots$, the finite dimensional distributions of $\tilde W$ converge weakly to a multivariate
normal distribution for almost all $R_1, R_2, \ldots$. The covariance of the multivariate normal
distributions is the same as the covariance of the corresponding finite dimensional distributions
of $W$.

Let $BL_1$ be the collection of functions $f$ from $\ell^\infty(\mathcal{T})$ into $[0, 1]$ that are Lipschitz continuous
with Lipschitz constant equal to 1. For fixed $\delta$ and $x \in \mathcal{T}$, let $\Pi_\delta(x)$ be the closest



1=1 

n 

1=1 

i=l 

= 0, j = l,2, 
= 0, j = l,2, 
= 0, 
point to $x$ in $\mathcal{T}$ in a partition of the set $\mathcal{T}$ with mesh-width $\delta$ (with respect to the semi-metric
$\rho$). By the triangle inequality

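$\sup_{f \in BL_1} |E_V f(\tilde W) - E f(W)| \le \sup_{f \in BL_1} |E f(W \circ \Pi_\delta) - E f(W)|$

$\quad + \sup_{f \in BL_1} |E f(W \circ \Pi_\delta) - E_V f(\tilde W \circ \Pi_\delta)| + \sup_{f \in BL_1} |E_V f(\tilde W \circ \Pi_\delta) - E_V f(\tilde W)|$

$\quad = I_1 + I_2 + I_3$.

As in van der Vaart and Wellner (1996, p. 182), the term $I_1$ converges to 0, because the
process $W$ has continuous paths with respect to $\rho$ and $W \circ \Pi_\delta \to W$ almost surely as $\delta \downarrow 0$. For
fixed $\delta > 0$, $I_2$ converges to 0 for almost all $R_1, R_2, \ldots$. This follows because conditionally on
$R_1, R_2, \ldots$, the finite dimensional distributions of $\tilde W$ converge in distribution to a multivariate
normal vector, for almost all $R_1, R_2, \ldots$. Finally,

$I_3 \le \sup_{f \in BL_1} E_V |f(\tilde W \circ \Pi_\delta) - f(\tilde W)| \le E_{V_1}\|\tilde W_1 \circ \Pi_\delta - \tilde W_1\|_{\mathcal{G}_{1\delta}}$

$\quad + E_{V_2}\|\tilde W_2 \circ \Pi_\delta - \tilde W_2\|_{\mathcal{G}_{2\delta}} + E_{V_3}\|\tilde W_3 \circ \Pi_\delta - \tilde W_3\|_{\mathcal{G}_{3\delta}}$,

where $\mathcal{G}_{j\delta} = \{g - g' : g, g' \in \mathcal{G}_j, \rho(g - g') < \delta\}$, for $j = 1, 2, 3$. The first two expectations
converge to 0 as $n \to \infty$ and $\delta \downarrow 0$, by Lemma 2.9.1 in van der Vaart and Wellner (1996, p.
177). The last expectation does not depend on $n$, and converges to 0 as $\delta \downarrow 0$.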

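It remains to consider the process $W^{\#}$ defined in Section 2. We show that unconditionally
$\|W_j^{\#} - \tilde W_j\| \to 0$ in probability. If this is the case, then for $\varepsilon > 0$, we have

$\sup_{f \in BL_1} |E^*_V f(W^{\#}) - E f(W)| \le \sup_{f \in BL_1} |E_V f(\tilde W) - E f(W)| + \sup_{f \in BL_1} |E^*_V f(W^{\#}) - E_V f(\tilde W)|$

$\quad \le \sup_{f \in BL_1} |E_V f(\tilde W) - E f(W)| + \varepsilon + 2P^*_V(\|W^{\#} - \tilde W\| > \varepsilon)$.

The first term converges to 0 in probability. The last term converges to 0 in (outer) mean.

Clearly, for $j = 3$, we have $\Sigma_n(\hat\theta) \to \Sigma(\theta_0)$, $\Sigma_{2n}(\hat\theta) \to \Sigma_2(\theta_0)$ and $\|\hat\psi_2 - \psi_2\|_\infty \to 0$ in
probability so that $\|W_3^{\#} - \tilde W_3\| \to 0$.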

Next, for $j = 1, 2, 3$, define

$H_j(t, D) = \frac{1}{n}\sum_{i=1}^n V_{1i}\,1(Z_i \in D)\,h_{jt}(Z_i)$,

where

$h_{jt}(Z) = \pi(D)^{-1}$, $j = 1$,

$\quad\quad = \pi(D)^{-1} f(\Gamma_{\theta_0}(t), \theta_0|Z)$, $j = 2$,

$\quad\quad = \pi(D)^{-1}\dot F(\Gamma_{\theta_0}(t), \theta_0|Z)$, $j = 3$.

We have $E H_j(t, D) = 0$ for $(t, D) \in \mathcal{T}$. Unconditionally, the strong law of large numbers yields
$H_j(t, D) \to 0$ a.s. pointwise in $(t, D) \in \mathcal{T}$. The convergence is also uniform since for each $D$,
the process $H_j(t, D)$ has paths of bounded variation. We also have $\tilde W_1 - W_1^{\#} = \sum_{j=1}^4 W_{1j}$,
where

Wii{t,D) = -y/^[FD-FD\{t)Hi{t,D) , 

Wi2{t,D) = MY^^-^-To-{e-eofte,]{t)H2{t,D) , 

Wi^{t,D) = V^[d-eof[t0,{t)H2{t,D) + H3it,D)], 

Wuit,D) = Op{i)-J2\Vii\o{V^\\r^^-Tg,f + v^ie-9of) . 

^ ■ 1 

1=1 

These four terms satisfy $\|W_{1j}\| \to 0$ in probability (unconditionally) and the same holds for
the process $\tilde W_1 - W_1^{\#}$.

Finally, define 

M{t) = -^Y.V2il{Xi<t,5i = l) , 

i=i 

W(f\ = 1 Y-T. m<t,S^ = l) f M{du) 

V^tt S{T^^iX,-),9,X,) Jo SiT^^{u-),e,u) 

A similar argument as in the analysis of the term $\tilde W_2$ shows that $\tilde W_4$ converges weakly
(unconditionally) to a mean zero time transformed Brownian motion with variance function $C_{\theta_0}(t)$.
Since $EN$ is a continuous function, so is $C_{\theta_0}$. We have



Wt{t) - WA{t) 



Wi{du) . 



Denote the term in the bracket by $a_n(u-)$. Then $a_n$ is a process with left-continuous paths and
right-hand limits, $\|a_n\| \to_p 0$ and

$\limsup_n \|a_n\|_v = O_p(1)$,

where $\|\cdot\|_v$ is the variation norm. For given $\delta > 0$, let $t_1 < t_2 < \ldots < t_k$ be a partition of $[0, \tau]$,
such that $C_{\theta_0}(t_i) - C_{\theta_0}(t_{i-1}) < \delta$. Define $\Pi_\delta(t) = t_{i-1}$ if $t \in [t_{i-1}, t_i)$. Then integration by
parts yields

W*{t) = f an{u-)^4 -W40 Us] (du) + f an{u-)[W4 o n^] (du) 
Jo Jo 

= [Wi -Wio Us] {t)an{t) + I [W4-mo Us] {u)an{du) + / a„(u-) [W4 o Us] {du) . 

Jo Jo 
The right-hand side then converges to 0 in probability uniformly in $t$ as $n \to \infty$, followed by
$\delta \to 0$. We also have



w*{t) = f w*{du)rg{u,t) , 

Jo 

Wo{t) = f W4{du)V0,{u,t) . 

Jo 



Then 



w*{t) = w*{t)- l^w*{u-)^{r^^{u-),e,u)NXdu) , 

Wo{t) = W4{t)- Woiu-)^{Te,{u-),eo,u)EN{du) . 
Jo ^ 



We have 



[W*{t)-Wo{t)] = Remit)- l\w*-Wo]{u-)^{T^^{u-),9,u)NXdn), 
Rem(t) = [Wf-Wilit) 

- J^Woiu-) (^^{T^^{u-),e,u)NXdu) - ^{ro{u-),eo,u)EN{du: 



We have $\|\mathrm{Rem}\| \to 0$ and $\|\mathrm{Rem}(\cdot -)\| \to 0$ in probability. Hence by Gronwall's inequality
(Beesack (1975))

\wt-Wo\{t) < |Rem(t)| 
+ ^ \Remiu-)\^-^{r^^iu-)Au)NXdu)e^p jy-^{T^^{u-),e,u)NXdu) 

\S'\ 

< maxsup |Rem(t)|, |Rem(t— )| limsupexp / --^{T^g{u—), 0,u)NXdu). 

t<T n Jq O ^ 

Application of Lemma 1 and integration by parts implies that this term converges to 0 in
probability, and $\|W_0^{\#} - \tilde W_0\| \to 0$ in probability. Similarly, we have $\|W_2^{\#} - \tilde W_2\| \to 0$ in
probability. □

Acknowledgement. I thank an anonymous reviewer and Roger Koenker for comments. 
Table 1. Regression estimates and standard errors
in the proportional odds ratio model.

covariate     theta     sd error    p-value
PS           -1.049     0.045       < 10^-5
SQUAMOUS     -0.246     0.428       0.71
SMALL         1.345     0.304       0.01
ADENO         1.275     0.342       0.02
LARGE         NA        NA          NA
Figure captions:

Figure 1. Quantile regression and 95% simultaneous confidence bands. Covariate space
partitioned according to four tumor types.

Figure 2. Quantile regression and 95% simultaneous confidence bands. Covariate space
partitioned into three groups according to the performance status (Karnofsky) score.

Figure 3. Aalen-Nelson, odds ratio function and Kaplan-Meier estimators for the four
tumor cell types: squamous (solid line), large (dotted line), small (long dash) and adeno (short
dash).